Journal of Applied Psychology 




Edited by 


Donald G. Paterson 


University of Minnesota 


Consulting Editors 


George K. Bennett, Psychological Corporation
Harold E. Burtt, Ohio State University
Allen L. Edwards, University of Washington
Clifford E. Jurgensen, Minneapolis Gas Co.
Irving Lorge, T. C., Columbia University
Quinn McNemar, Stanford University
Alexander Mintz, City College of New York
James P. Porter, Claverack, New York
Harold F. Rothe, Fairbanks, Morse and Co., Beloit, Wis.
Julian B. Rotter, Ohio State University
Edward K. Strong, Jr., Stanford University
Donald E. Super, T. C., Columbia University
Morris S. Viteles, University of Pennsylvania
Alfred C. Welch, Knox-Reeves, Minneapolis


Volume 37, 1953 


Published Bi-monthly by the American Psychological Association, Inc.


Prince and Lemon Sts., Lancaster, Pa.


Entered as second-class matter, must 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879 


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, authorized October 10, 19


Copyright 1953, by The American Psychological Association, Inc.


Contents of Volume 37 


Articles
Anderson, S. B. Prediction and Practice Tests at the College Level............. 256 
Anderson, S. B. Estimating Grade Reliability.................................. 461


Anikeeff, A. M. Factors Affecting Student Evaluation of College Faculty Members. 458 
Ash, P. and Hobaugh, T. R. Some Primary Ratable Characteristics of Instruc- 
Penal Ce AEE E E EE T ETE, EN EEE 293 
Baehr, M. E. A Simplified Procedure for the Measurement of Employee Attitudes 163
Bass, B. M., Klubeck, S. and Wurster, C. R. Factors Influencing Reliability and
Validity of Leaderless Group Discussion Assessment........................ 26
Bass, B. M. and Wurster, C. R. Effects of the Nature of the Problem on LGD
Performance.............................................................. 96
Bass, B. M. and Wurster, C. R. Effects of Company Rank on LGD Performance
of Oil Refinery Supervisors............................................... 100
Beaver, A. P. Kuder Interest Patterns of Student Nurses..................... 370 
Beaver, A. P. Personality Factors in Choice of Nursing........................ 374 
Belson, W. A. The Effect on Recall of Changing the Position of a Radio
Advertisement............................................................. 402
Bendig, A. W. The Reliability of Self-Ratings as a Function of the Amount of 
Verbal Anchoring and of the Number of Categories on the Scale.............. 38 
Bernberg, R. E. Socio-Psychological Factors in Industrial Morale: II........... 249 
Bills, M. A. and Taylor, J. G. Over and Under Achievement in a Sales School in
Relation to Future Production............................................. 21
Bridge, L. and Morson, M. Item Validity of the Lee-Thorpe Occupational Interest
Inventory................................................................. 380


Brown, C. W. and Ghiselli, E. E. Prediction of Labor Turnover by Aptitude Tests. 9 
Brown, C. W. and Ghiselli, E. E. Per Cent Increase in Proficiency Resulting from
Use of Selective Devices.................................................. 341
Brown, C. W. and Ghiselli, E. E. The Prediction of Proficiency of Taxicab Drivers. 437
Callis, R. The Efficiency of the Minnesota Teacher Attitude Inventory for
Predicting Interpersonal Relations in the Classroom........................ 82
Canfield, A. A. Administering Form BB of the Kuder Preference Record, Half 
et E T Sere ere reer _ . for 
Canfield, A. A., Comrey, A. L. and Wilson, R. C. The Influence of Increased
Positive g on Reaching Movements.......................................... 230
Canter, R. R., Jr. A Rating-Scoring Method for Free-Response Data............. 455
Case, H. W. An Analysis of Engineering Entrance Examinations.................. 42
Cattell, R. B. and Anderson, J. C. The Measurement of Personality and Behavior
Disorders by the I. P. A. T. Music Preference Test......................... 446
Chriswell, M. I. Validity of a Structural Dexterity Test...................... 13
Clark, K. E. and Swanson, C. E. Attitudes Toward Public Low-Rent Housing,
Before and After Construction............................................. 201
Cohen, J., Vanderplas, J. M. and White, W. J. Effect of Viewing Angle and Parallax
upon Accuracy of Reading Quantitative Scales............................... 482


Coleman, W. An Economical Test Battery for Predicting Freshman Engineering
Course Grades............................................................. 465
Comrey, A. L. Group Performance in a Manual Dexterity Task................ 207
Drake, L. E. Differential Sex Responses to Items of the MMPI.................. 46
Dunnette, M. D. The Minnesota Engineering Analogies Test..................... 170




Simplified Flesch Reading Ease Formulas....................................... 107
Edwards, A. L. The Relationship Between the Judged Desirability of a Trait and
the Probability that the Trait will be Endorsed


Fattu, N. A. and Mech, E. V. The Effect of Set on Performance in a “Trouble
Shooting” Situation
Fleishman, E. A. The Description of Supervisory Behavior....................... 1
Fleishman, E. A. The Measurement of Leadership Attitudes in Industry
Fleishman, E. A. A Modified Administration Procedure for the O’Connor Finger
Dexterity Test............................................................ 191


Standardized Tests... l... ESP En 
Friedman, N. The Quartile Difference Method of Item Selection
Garry, R. Individual Differences in Ability to Fake Vocational Interests


Ghiselli, E. E. and Barthol, R. P, 


Selection of BOOP srr saison susanna T A T et A 
Gilliland, A. R. and Newman, S. E. The Humm-Wadsworth Temperament Scale
as an Indicator of the “Problem” Employee................................. 176
Gough, H. G. The Construction of a Personality Scale to Predict Scholastic
Achievement
Gray, J. S., Sustare, G. and Thompson, A. An Apparatus for Measuring
Operational Hand Steadiness............................................... 57
Harris, S. and Smith, K. U. Dimensional Analysis of Motion: V. An Analytic a6 
Test of Psychomotor Ability... lL... ec e 445 
Hay, E. N. A Note on Small Samples... 0 75 
Hendrix, O. R. Predicting Success in Elementary Accounting. -e.tre ae 
Herzberg, F. and Russell, D. The Effects of Experience and Change of Job Inter 478 
arnt a a a 5 sions cus eerste etme ae ee 281 
Hofstaetter, P, R. The Actuality Measure in the Study of Public Opinion. Lee oneal 
Holland, Ji nore Krause, A. H., Nixon, M. E. and Trembath, M, F. The Classification 
of Occupations by Means of Kuder Interest Profiles: ĮI, The Development 263 
Interest Dr ie cones eraannm a, nn ao SER 53 
Irvine, D. A Note on Ranking Method E “ea 


Iscoe, I. and Lucier, O A ; ; n Scale o 
anb Ns ompar f -Verno z 
Val (1951) ‘ Parison of the Revised Allport ; 


i Kuder Preference Record (Personal 
Jenkins, J. J, Some Measured Chara ee 


and Success in Forecasting ee 


A A E E Ty dse 8 Go ea BEE M4524 974 van cow mpemaresie eet ae 78 
Jenkins, W. L. An Index of Selective Efficiency (S) for Evaluating a Selection Plan 


Jennings, E.E, The Motivati i ers 08 
55; i on Factor in Testing S i yinka n E ee oem 
Johnsgard, K, W. € > tng Supervisors... iform 
Alignment... eee Reusing as a Function of Pointer Symmetry and Vaio .. 407 
Kine} Rand ea Tots Wian Daad to Sloe ie Boas, gë Tao Workers. 348 
er, J. R. and Kinzer, L, G T he- 
matics. a > Predicting Grades in Advanced College Mat 
Kriedt, F; H. and Gadel, Mog gies RPS eS. cary ET EIER a 
P.H. e satel, M, S. Predict Sawn RG HE kers.. 338 
Kunnath, J. G. and Kerr, W, A. roan of Turnover Among Clerical ies Cot 
LaPorte Boards..." VA 1 n Analysis of Thirty-Two Ame 65 
awshe, C. H, and Nagle, B F. Pradsictiyity gia opat serzereas anneni 159 




Layton, W. L. Predicting Success in Dental School
LeShan, L. L. and Brame, J. B. A Note on Techniques in the Investigation of
Accident Prone Behavior
Lincoln, R. S. Visual Tracking: III. The Instrumental Dimension of Motion in
Relation to Tracking Accuracy
Long, L. and Perry, J. D. Academic Achievement in Engineering Related to
Selection Procedures and Interests
Longstaff, H. P. and Jurgensen, C. E. Fakability of the Jurgensen Classification
Inventory

MacLean, A. G., Tait, A. T. and Catterall, C. D. The F Minus K Index on the 


Mausner, B. Studies in Social Interaction: III. Effect of Variation in One
Partner’s Prestige on the Interaction of Observer Pairs
McGurk, F. C. J. Socio-Economic Status and Culturally-Weighted Test Scores of
Negro Subjects
McIntyre, C. J. The Validity of the Mooney Problem Check List
Moore, J. E. and Ross, L. W. The Changing of Mental Test Norms in a Southern
Industrial Plant
Morgan, W. J. and Morgan, A. B. Logical Reasoning: With and Without Training
Mueser, R. E. The Weather and Other Factors Influencing Employee Punctuality
Murray, J. E. An Evaluation of Two Experimental Charts as Navigational Aids
to Jet Pilots
Navran, L. Validity of the Strong Vocational Interest Blank Nursing Key
Nuckols, R. C. A Note on Pre-Testing Public Opinion Questions
Nuckols, R. C. A Study of Respondent Forewarning in Public Opinion Polls
Oliver, J. E. A Punched Card Procedure for Use with Partial Pairing
Parker, J. W., Jr. Psychological and Personal History Data Related to Accident
Records of Commercial Truck Drivers
Peters, H. C. The Prediction of Success and Failure in Elementary Foreign
Language Courses
Prothro, E. T. Identification of American, British, and Lebanese Cigarettes
Prothro, E. T. Identification of Cola Beverages Overseas
Remmers, H. H. and Kirk, R. B. Scalability and Validity of the Socio-Economic
Status Items of the Purdue Opinion Panel
Rock, M. L. Visual Performance as a Function of Low Photopic Brightness Levels
Ross, S. and Fletcher, J. L. Response Time as an Indicator of Color Deficiency
Ross, S., Ray, W. and Della Valle, L. Pointer Location and Accuracy of Dial
Reading
Schneider, D. E. and Bayroff, A. G. The Relationship Between Rater
Characteristics and Validity of Ratings
Schofield, W. A Study of Medical Students with the MMPI: III. Personality
and Academic Success
Shaffer, R. H. and Kuder, G. F. ... Business School Alumni
Siegel, A. I. and Siegel, E. Flesch Readability Analysis of the Major Pre-Election
Speeches of Eisenhower and Stevenson
Simpson, R. H. Rating Patterns for Maximizing Competition and Minimizing
Number of Comparative Judgments Necessary for Each Rater
Smader, R. and Smith, K. U. Dimensional Analysis of Motion: VI. The
Component Movements of Assembly Motions




251 


79 


489 




AIA E EELT Sop ah T E EA hea ea evi nceuins a eiee. a TTET z 
Stanley, J. C. Study of Values Profiles Adjusted for Sex and Variability Difference 
Swanson, C. E. and Fox, H. G. Validity of Readability Formulas
Taylor, E. K. and Schneider, D. E. A Biasing Factor in Essay Response Frequency
Tomlinson, H. and Preston, J. T. Development of a Short Test to Predict a
Complex Aggregate Score
Torrance, E. P. Methods of Conducting Critiques of Group Problem-Solving
Performance


Tydlaska, M. and Mengel, R. A Scale for Measuring Work Attitude for the MMPI


Tyler, F. T. and Michaelis, J. U. A Comparison of Manual and College Norms for 
KaT AUE e AA A Y da feacecmemmemnscanic. rc amin 
Uhlaner, J. E., Gordon, D. A., Woods, I. A. and Zeidner, J. The Relationship
Between Scotopic Visual Acuity and Acuity at Photopic and Mesopic Brightness
Levels
Weitz, J. and Nuckols, R. C. A Validation Study of “How Supervise?”
White, W. J., Warrick, M. J. and Grether, W. F. Instrument Reading III: Check
Reading of Instrument Groups
Willerman, B. The Relation of Motivation and Skill to Active and Passive
Participation in the Group
Woods, W. A. Influence of Ink Color on Handwriting of Normal and Psychiatric
Groups


Book Reviews 


Argyris’ An Introduction to Field Theory and Interaction Theory: H. J. Eysenck


Arsenian’s In Memoriam—Rudolf Pintner: Donald G. Paterson 

Barlow’s Mental Prodigies: Lewis M. Terman
Campbell’s Practical Applications of Democratic Administration: Hugh M. Shafer 
Curran’s Counseling in Catt 


Deese’s The Psychology of Learning: O. Hobart Mowrer

Division of Occupational Analysis, United States Employment Service's Dictionary
of Occupational Titles, Second Edition: Alan M. Kershner
Dooher and Marquis’ The Development of Executive Talent: C. G. Browne


, : Abraham S. Levine... -eooo oca 
urnbull’s Personnel Admin 


Karn and Gil b p : y: Forrest L. Donik eene ao pan 
Readings in Fereti “ee $ | Business Psychology, and Blum’s 
Kelly and Fiske’s The Pred; SHOlog yes Philip Ely IIe Be ss seess mass 
Jacobs is ance in Clinical Psychology : Stanley E. 


Kephart's The Employment Interview in Industry: Harold E. Burtt............... 239
Laird and Laird’s Practical Sales Psychology: S. Rains Wallace, Jr..........-.... 324 
Lauer’s Learning to Drive Safely: Stanley E. Jacobs..........-...+++.0s-eeeuee 242 
Maier’s Principles of Human Relations, Applications to Management: Wilton P. 
Chere iia: e eoan ERE 489 FBS BASRA Ow G E E ESERE aE + pine tae errant 432 
Miller and Form’s Industrial Sociology; An Introduction to the Sociology of Work
Relations: Glaister A. Elmer.............................................. 59
Parker and Kleemeier’s Human Relations in Supervision: William E. Kendall.... 62 
Prasad’s Fatigue and Efficiency in Textile Industry: Harold F. Rothe............ 242 
Shostrom and Brammer’s The Dynamics of the Counseling Process: John W. 
CORSE og waKetONS EEG Ed EO Fo CUO BHR Sew RoE ee Re E ESIN 243 
Steiner’s A Practical Guide for Troubled People: Harold Seashore..............- 500 
Ulrich, Booz and Lawrence's Management Behavior and Foreman Attitude:
Theodore R. Lindbom....................................................... 433
Walker and Guest's The Man on the Assembly Line: John M. Cook............ 324 
Wechsler’s The Range of Human Capacities: James J. Jenkins.................. 240 
Weinland and Goss’ Personnel Interviewing: Clifford E. Jurgensen.............. 434 


Welch and Stone’s How to Build a Merchandise Knowledge Test: Edwin E. Ghiselli 64 

Wolfle, Buxton, Cofer, Gustad, MacLeod, and McKeachie’s Improving
Undergraduate Instruction in Psychology: Sidney L. Pressey................. 147

Zaleznik's Foreman Training in a Growing Enterprise: Theodore R. Lindbom..... 63 


Applied Psychology in Action 


Bills, M. A. Our Expanding Responsibilities................................... 142
Hadley, H. D. The Non-Directive Approach in Advertising Appeals............ 496
Lindbom, T. R. Evaluating Supervisory Training at the Job Performance Level.. 428
Vallance, T. R., Glickman, A. S., and Suci, G. J. Criterion Rationale for a Personnel
Research Program.......................................................... 429
A New Management Tool for Top Executives..................................... 321
Background of an Industrial Psychologist...................................... 321
How's Your Empathy?.......................................................... 431
Job Supervision of Young Workers.............................................. 236
News Item..................................................................... 146
Noise and Absenteeism......................................................... 322
Personnel Psychology in a Steel Company....................................... 238
Reading: Stop Wasting Your Time............................................... 498

Miscellaneous

New Books, Monographs, and Pamphlets.................... 151, 247, 328, 435, 502


Journal of Applied Psychology


Vol. 37, No. 1


FEBRUARY, 1953 


The Description of Supervisory Behavior * 


Edwin A. Fleishman 


USAF Air Training Command, Human Resources Research Center, 
Lackland Air Force Base, Texas ** 


Previous research in the area of leadership 
has to a large extent been concerned with 
postulated traits that leaders should possess, 
or with over-all evaluations of leadership. 
The leader’s actual behavior has been largely 
ignored. More recent research has concluded 
that leadership is to a great extent situational, 
and that what is effective leadership in one 
situation may be ineffective in another. It 
therefore seems desirable to have available a 
method of describing leadership behavior 
which can be applied to many different situations.
If this were possible, then different
leadership patterns could be related to criteria 
of effectiveness in a wide variety of group 
situations in which leaders function. 

There have been some recent attempts to
develop methods for the description of leadership
behavior. This article is a report of
one such attempt, which was carried out within
the framework of the Leadership Studies at
the Personnel Research Board of Ohio State
University. The primary emphasis in this
article will be to describe the development of
a Supervisory Behavior Description questionnaire
for use in an industrial situation.

* This study was carried out while the writer was at the Personnel Research Board, Ohio State University, in cooperation with the International Harvester Company.

** Perceptual and Motor Skills Research Laboratory. The opinions or conclusions contained in this report are those of the author. They are not to be construed as reflecting the views or endorsement of the Department of the Air Force.

Developmental Background of the Instrument

The Leader Behavior Description. The
Supervisory Behavior Description is derived from
the Leader Behavior Description Questionnaire
originally developed by Hemphill and
the staff at the Personnel Research Board (2).
The questionnaire contained 150 items which 
described how people in leadership positions
operate in their leadership role. The re- 
spondent marked for each item, how fre- 
quently the leader did what each item de- 
scribed (e.g., always, often, occasionally, 
seldom, never). 

A major problem in this endeavor was the 
classification of the items into meaningful 
categories of leader behavior. The 150 items 
were derived from over 1,800 original items 
which were written and then classified by 
“expert judges” into the following nine a 
priori “dimensions” of leadership behavior: 

1. Integration—acts which tend to increase 
cooperation among group members or decrease 
competition among them.

2. Communication,—acts which increase the 
understanding and knowledge about what is 
going on in the group. 

3. Production emphasis,—acts which are 
oriented toward volume of work accom- 
plished. 

4. Representation,—acts which speak for 
the group in interaction with outside agencies. 

5. Fraternization—acts which tend to make 
the leader a part of the group. 

6. Organization——acts which lead to dif- 
ferentiation of duties and which prescribe 
ways of doing things. 


1 An earlier approach at the Personnel Research
Board developed modified job analysis procedures
for investigating types of organizational activities
engaged in by persons in high organizational positions.
These methods have been summarized by Stogdill
and Shartle (5) and by Shartle (4).




7. Evaluation, —acts which have to do with 
distribution of rewards (or punishment). 

8. Initiation—acts which lead to changes 
in group activities. 

9. Domination—acts which disregard the 
ideas or persons of members of the group. 

An example of an item assigned to the Inte- 
gration area was “He encourages group mem- 
bers to work as a team.” An example of one 
assigned to the Domination area was “He 
insists that everything be done his way.” 

Subsequent administration of the form 
yielded adequate reliabilities for the nine di- 
mension scores (.71 to .88) when groups filled 
it out as describing their own leader. More- 
over, group members were consistent in how 
they described the same leader. However, the 
striking feature of repeated use of the ques- 
tionnaire in various types of situations was 
the lack of independence of the dimensions. 
Most of the intercorrelations were between .50 
and .80. 

Item analysis also showed that an item assigned
to one dimension by a priori methods
might just as easily correlate more highly with
scores on dimensions to which the item was
not assigned. Some reorganization of the
items into relatively more independent categories
of leader behavior therefore seemed
necessary.


Factor Analysis and Revision of the Leader 
Behavior Description. In order to identify 
empirically the factor structure of the ques- 
tionnaire, a factor analysis of the items was 
undertaken. The questionnaire was administered
to 300 Air Force crew members who
described their airplane commanders. The 
Wherry-Gaylord Iterative Factor Analysis 
Procedure (6, 7) was 
of the items. 


structure. The 
factors Present, 


3 This Procedure does no 


correlations, but starts with item-sub- 


tors. The major factors were defined as “Con- 
sideration” and “Initiating Structure.” 

Items in the “Consideration” dimension 
were concerned with the extent to which the 
leader was considerate of his workers’ feelings. 
It reflected the “human relations” aspects of 
group leadership. 

Items in the “Initiating Structure” dimen- 
sion reflected the extent to which the leader 
defined or facilitated group interactions to- 
ward goal attainment. He does this by plan- 
ning, communicating, scheduling, criticizing, 
trying out new ideas, etc. 

The minor factors were tentatively labeled 


“Production Emphasis” and “Social Sensi- 
tivity.” 


Pre-Test on an Industrial Population 


New keys were developed to score the
questionnaire along these factor dimensions. 
Items with the highest loadings and purest 
factor structure were selected for each key. 
It was felt that scoring the questionnaire 
along these four dimensions would yield lower 
intercorrelations between the dimensions and 
would thus give measures of more independent 
aspects of the leader’s behavior. A 136-item 
Supervisory Behavior Description question- 
naire was administered to a pre-test sample 
of 100 International Harvester foremen at the 
Company’s Central School in Chicago. These
foremen, representing 17 different plants, used 
the questionnaire to describe the behavior of 
their own supervisors. The questionnaires 
were scored along the new factor dimensions 
derived from the Air Force sample. The purpose
of this industrial pilot-study was to find
out how applicable these new scales were to
the industrial sample, and to determine what 
further revision might be necessary. 

Dimension Reliabilities and Intercorrelations.
Intercorrelations of the dimension
scores showed that they still had substantial
overlap with one another when applied to this
industrial population. The intercorrelations
were between .56 and .80, with corrected split-half
reliabilities between .77 and .95. It
seemed possible that the categories of leader
behavior which were most independent in this
industrial sample might be somewhat different

Table 1

Items Selected for the Revised Form of the Supervisory Behavior Description 1

    Orthogonal Factor Loading
 “Consid-   “Initiating
 eration”   Structure”    Item

“Consideration” Revised Key
   -.68        .06        He refuses to give in when people disagree with him.
    .40        ...        *He does personal favors for the foremen under him.
    .70        .19        He expresses appreciation when one of us does a good job.
    .70        .13        He is easy to understand.
   -.40       -.08        *He demands more than we can do.
    .32        .05        *He helps his foremen with their personal problems.
   -.49        .03        *He criticizes his foremen in front of others.
    .54        ...        He stands up for his foremen even though it makes him unpopular.
   -.52       -.01        He insists that everything be done his way.
    .70        .05        He sees that a foreman is rewarded for a job well done.
   -.62       -.06        He rejects suggestions for changes.
   -.69        .09        *He changes the duties of people under him without first talking it over with them.
   -.72        ...        He treats people under him without considering their feelings.
    .68        ...        He tries to keep the foremen under him in good standing with those in higher authority.
   -.57        .19        He resists changes in ways of doing things.
   -.61        .37        He “rides” the foreman who makes a mistake.
   -.72        .23        He refuses to explain his actions.
    ...        ...        *He acts without consulting his foremen first.
    ...        ...        He stresses the importance of high morale among those under him.
    .62        .16        He backs up his foremen in their actions.
    ...        ...        He is slow to accept new ideas.
    ...        ...        He treats all his foremen as his equal.
    .63        .14        He criticizes a specific act rather than a particular individual.
    .78        .09        He is willing to make changes.
    .86        .17        He makes those under him feel at ease when talking with him.
    .82       -.02        He is friendly and can be easily approached.
    .87        ...        He puts suggestions that are made by foremen under him into operation.
    .65       -.02        He gets the approval of his foremen on important matters before going ahead.

“Initiating Structure” Revised Key
    ...        .40        **He encourages overtime work.
   -.10        .42        *He tries out his new ideas.
   -.20        .58        He rules with an iron hand.
   -.18        .59        He criticizes poor work.
    ...        .60        **He talks about how much should be done.
    .17        .33        *He encourages slow-working foremen to greater effort.
   -.07       -.28        He waits for his foremen to push new ideas before he does.
    .00        .26        He assigns people under him to particular tasks.
    ...        ...        He asks for sacrifices from his foremen for the good of the entire department.
    ...        ...        He insists that his foremen follow standard ways of doing things in every detail.
   -.17        .87        He sees to it that people under him are working up to their limits.
    .36        .29        *He offers new approaches to problems.
    .13        ...        He insists that he be informed on decisions made by foremen under him.
    ...        ...        He lets others do their work the way they think best.
    .03        ...        **He stresses being ahead of competing work groups.
   -.17        .50        **He “needles” foremen under him for greater effort.
    ...        .63        He decides in detail what shall be done and how it shall be done.
    ...        .68        **He emphasizes meeting of deadlines.
   -.22        .40        *He asks foremen who have slow groups to get more out of their groups.
    ...        .51        **He emphasizes the quantity of work.

1 Items not starred used the format: 1. always; 2. often; 3. occasionally; 4. seldom; 5. never. Items preceded by an asterisk (*) used the format: 1. often; 2. fairly often; 3. occasionally; 4. once in awhile; 5. very seldom. Items preceded by a double asterisk (**) used the format: 1. a great deal; 2. fairly much; 3. to some degree; 4. comparatively little; 5. ...



from those found most independent in the Air 
Force data. 
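
The "corrected split-half reliabilities" quoted above are half-scale correlations stepped up to the length of the full scale. A minimal sketch of that correction follows, assuming the standard Spearman-Brown formula. The function name and the sample half-scale correlation are illustrative only and are not taken from the study.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate full-length reliability from a part-scale correlation.

    r_half: correlation between the two halves of a scale.
    length_factor: how many times longer the full scale is than each part
                   (2 for an ordinary split-half).
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


if __name__ == "__main__":
    # Hypothetical half-scale correlation, chosen only for illustration.
    print(round(spearman_brown(0.63), 2))  # prints 0.77
```

The correction always raises a positive half-scale correlation, which is why corrected values as high as .95 can sit alongside sizeable correlations between dimensions.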

Item Analysis. In order to clarify this 
problem and to revise the questionnaire for in- 
dustrial use a statistical analysis was carried 
out at the item level. Two kinds of informa- 
tion were obtained concerning each of the 136 
items in the Supervisory Behavior Description 
questionnaire. First, the distributions of re- 
sponses among the five choices for each item 
were considered. Second, tetrachoric correlations
of every item with each dimension total
score were calculated to give indices of the internal
consistency of the dimensions and to
reveal the sources of overlap between the di- 
mensions. Thus, coefficients were not only 
computed between an item and its own dimen- 
sion total score, but with each of the other 
three dimension total scores to which it had 
not been assigned. 
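
The item-by-dimension correlations described above can be sketched as follows. The article does not say how the tetrachoric coefficients were computed, so this example swaps in the well-known cosine-pi approximation applied to a fourfold table. The cut points, function names, and sample counts are all assumptions made for illustration.

```python
import math


def fourfold(item, total, item_cut, total_cut):
    """Cross-classify respondents as high/low on an item and on a dimension total.

    Returns (a, b, c, d): a = high on both, b = item high only,
    c = total high only, d = low on both.
    """
    a = b = c = d = 0
    for x, t in zip(item, total):
        hi_x, hi_t = x >= item_cut, t >= total_cut
        if hi_x and hi_t:
            a += 1
        elif hi_x:
            b += 1
        elif hi_t:
            c += 1
        else:
            d += 1
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation (an assumed method)."""
    if b == 0 or c == 0:   # no discordant cases in one direction
        return 1.0
    if a == 0 or d == 0:   # no concordant cases in one direction
        return -1.0
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


if __name__ == "__main__":
    # Hypothetical counts: 40 high-high, 10 + 10 discordant, 40 low-low.
    print(round(tetrachoric_cos_pi(40, 10, 10, 40), 3))  # prints 0.809
```

Running the same computation for an item against each of the four dimension totals gives the row of coefficients used to judge whether the item belongs to its assigned dimension or overlaps with the others.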

This analysis revealed that most of the 
items correlated highly with the dimension to 
which they were assigned. However, it was 
also evident that most of the items correlated 
highly with one or more dimensions to which 
they were not assigned. 

Following the Wherry-Gaylord rationale (6,
7), the item-dimension correlations were considered
factor loadings of the items on the four
oblique (correlated) dimensions. In order to 


compare the loadings with those obtained from 
the Air Force population, transformation to 
orthogonality was accomplished and it ap- 
peared, by inspection, that this transformation 
brought the loadings more in line with the 
original factors derived from the factor analy- 
sis. Item loadings increased on dimensions to 
which they were assigned and decreased on 
other dimensions. This seemed especially true 
for the two major factors (Consideration and 
Initiating Structure). Further preliminary 
rotations were then made with the primary 
objective of rotating the items originally in 
the two minor factors into more independent 
clusters. It appeared that this might not be 
possible, and in the light of the high correla- 
tions between these factors and the other two, 
their utility as separate dimensions was ques- 
tioned for this population. Practically all the 
variation could be accounted for by the two 
major dimensions. 

The Revised Questionnaire. Based on the
item-dimension loadings derived from this industrial
population, two revised scoring keys
were developed, one for “Consideration” and
one for “Initiating Structure.” Criteria for
item inclusion were: (1) the item should have
a high loading with the dimension in which it
was to be included; (2) the item should have
as close to zero loading as possible on the
other factor; and (3) items which did not
discriminate among supervisors (most respondents
picking the same alternatives) were
rejected.
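
The three inclusion criteria can be read as a simple filter over the item loadings. The sketch below is one way to express that filter. The numeric cut-offs, field names, and sample items are hypothetical, since the article gives the criteria only in qualitative terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    text: str
    load_consideration: float
    load_structure: float
    modal_share: float  # share of respondents choosing the most popular alternative


def assign_key(item: Item,
               min_own: float = 0.40,     # assumed floor for a "high" loading
               max_other: float = 0.25,   # assumed ceiling for a "near zero" loading
               max_modal: float = 0.80) -> Optional[str]:
    """Return the key an item qualifies for, or None if it is rejected."""
    # Criterion 3: reject items on which most respondents give the same answer.
    if item.modal_share > max_modal:
        return None
    # Negatively worded items load negatively, so magnitudes are compared.
    c, s = abs(item.load_consideration), abs(item.load_structure)
    # Criteria 1 and 2: high on one dimension, close to zero on the other.
    if c >= min_own and s <= max_other:
        return "Consideration"
    if s >= min_own and c <= max_other:
        return "Initiating Structure"
    return None


if __name__ == "__main__":
    pool = [
        Item("hypothetical item A", 0.78, 0.09, 0.45),
        Item("hypothetical item B", -0.20, 0.58, 0.40),
        Item("hypothetical item C", 0.61, 0.37, 0.50),
    ]
    for it in pool:
        print(it.text, "->", assign_key(it))
```

Fixed thresholds are only a convenience here. The article describes picking the items that best met the criteria, which amounts to ranking candidates rather than applying hard cut-offs.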

Twenty-eight items best meeting these criteria
for “Consideration” and 20 items for
“Initiating Structure” were selected. Table
1 presents the items finally selected for the
revised form. The loadings given are those
derived from this industrial population.

It can be seen that most of the items assigned
to each key have high loadings with
that dimension and insignificant loadings with
the other. In addition, one more step was
carried out. It was possible to select items
for the “Initiating Structure” key so that some
items had small negative loadings on “Consideration,”
and others had small positive
loadings on “Consideration.” It was hoped
that the total effect of this would be to cancel
out further the unwanted variance in the
“Initiating Structure” key due to these cumulative
small loadings on “Consideration.”

The items in each key were, as before, ran- 
domly distributed through the questionnaire. 


Administration of the Revised Form 


This 48-item revised Supervisory Behavior 
Description was then administered to another 
comparable sample of 122 foremen in one of 
the International Harvester Company’s plants. 
Again they were to describe their own super- 
visors. Assurances were again given that no 
one in the company would see their answers. 

Table 2 presents some of the results. 

From the results on this sample, it appeared 
that the scores on the two dimensions were 
now independent of each other.

Another index of the utility of the instrument is the agreement among
different respondents who describe the same supervisor's behavior. The
variation in scores can be divided into that between descriptions of
different supervisors and that within descriptions of the same
supervisor. This “within description” variation represents lack of
agreement between respondents describing the same supervisor. The
analysis of variance revealed significantly less variation among
descriptions of the same supervisor than between descriptions of
different supervisors.⁴ This appears to be further evidence of the
objectivity of this questionnaire procedure.


Table 2


Means, Standard Deviations, Range, Reliabilities, and
Intercorrelations of the Dimension Scores of the
Revised Supervisory Behavior Description
(N = 122)

                                          Initiating
                        Consideration     Structure
No. of Items                 28               20
Mean                        82.3          [illegible]
Standard Deviation          15.5          [illegible]
Range                    22 to 106         13 to 68
Reliability²                .92               .68
Intercorrelation                 [illegible]

¹ In this form, the alternatives to each item were weighted from zero
to four. The maximum possible score was 112 for Consideration and 80 for
Initiating Structure.
² Split-half correlations corrected to the full length of each
dimension by the Spearman-Brown formula.
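
The between/within partition described here is a one-way analysis of variance with supervisors as groups. The sketch below shows the arithmetic on invented ratings. The conversion from F to ε uses the form of epsilon-squared commonly attributed to Kelley; that this matches the exact expression the article takes from Peters and Van Voorhis is an assumption, and the function names are hypothetical.

```python
import math


def one_way_anova(groups):
    """groups: list of lists; each inner list holds the scores that
    several respondents gave to the same supervisor."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation between descriptions of different supervisors.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation within descriptions of the same supervisor (disagreement).
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


def epsilon_from_f(f_ratio, df_between, df_within):
    """Epsilon as the square root of epsilon-squared (assumed Kelley form)."""
    eps_sq = df_between * (f_ratio - 1) / (df_between * f_ratio + df_within)
    return math.sqrt(max(eps_sq, 0.0))


# Invented scores: three supervisors, three respondents each.
ratings = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_ratio, df_b, df_w = one_way_anova(ratings)
epsilon = epsilon_from_f(f_ratio, df_b, df_w)
```

A large F (and an ε near 1) means respondents agree closely about the same supervisor relative to how much supervisors differ from one another.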

The questionnaire was also administered to 
a sample of 394 workers who described the be- 
havior of their own foreman. In this case the 
reliabilities of the scales were .98 and .78 and 
the correlation between them was −.33. It
will be recalled that the pre-test sample con- 
sisted of people at the foreman level, so it 
might be expected that the correlation be- 
tween dimensions would be somewhat higher 
in this sample of workers. This correlation 
is still considerably lower than had been 
obtained between dimensions with previous 
forms of the instrument. An analysis of variance again revealed
significant agreement among workers describing the same foreman.⁵

It appeared that the two dimensions isolated 
were quite meaningful in this industrial situa- 
tion. Apparently a supervisor could be high 
in Consideration without necessarily being 
high or low in the amount of planning, push- 
ing for production, scheduling, or initiating 
behavior engaged in. At least the usual “halo 
effect” from scale to scale that occurs in most 
instruments in this area, seems for the most 
part to have been eliminated. The independ- 
ence of the dimensions has special relevance 
when one considers the relationships of each 
of the two dimensions with some external cri- 
teria of group effectiveness. 

The development of external criteria of group effectiveness was far
beyond the scope of the present study. However, the Industrial Relations
Department of the plant did have available the number of labor
grievances filed for each of 23 departments during an eight-month
period. These were reduced to grievances per worker for each department
and correlated against the mean scores derived (from descriptions by
foremen) for the general foreman in charge of each department. Although
the N of 23 is pitifully small, and the records attenuated by many
uncontrollable factors, correlations of −.43 with “Consideration” and
.26 with “Structure” were obtained. Only the first coefficient is
statistically significant. The trend, however, was for the high
grievance departments to be those with supervisors lower in
consideration and higher in the amount of structuring in their
leadership behavior. These results, of course, are purely suggestive.
A more highly controlled criterion study of group effectiveness and its
relationships to these dimensions is a program of future research.


⁴ Peters and Van Voorhis (3) suggest the conversion of F ratios to
epsilon (ε), a statistic which indicates the strength of relationship.
For these results ε = .65 (P < .01) for Consideration and .47
(P < .05 > .01) for Initiating Structure.
⁵ ε = .72 (P < .01) for Consideration and [value illegible in source]
for Initiating Structure.
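
The statement that only the first coefficient is significant can be checked with the usual t test for a correlation. This is a sketch under stated assumptions: a two-tailed test at the 5% level with N − 2 = 21 degrees of freedom, and a critical t of 2.080 taken from a standard t table. The article does not say which test was used.

```python
import math

T_CRIT_05_DF21 = 2.080  # two-tailed 5% point of t for 21 df (tabled value)


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


n_departments = 23
t_consideration = t_from_r(-0.43, n_departments)  # grievances vs. Consideration
t_structure = t_from_r(0.26, n_departments)       # grievances vs. Structure

consideration_significant = abs(t_consideration) > T_CRIT_05_DF21
structure_significant = abs(t_structure) > T_CRIT_05_DF21
```

Under these assumptions the −.43 clears the 5% threshold and the .26 does not, in line with the text.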

The instruments have also been found use- 
ful in evaluating a leadership training program 
for foremen in the company and in studying 
relationships of leader behavior with certain 
factors in the social situation in which the 
foremen operate (1). 


Summary 


This paper has described the development 
of one approach to the problem of describing 
leadership behavior in industry. A question- 
naire, based on earlier work by Hemphill, was 
constructed. By means of this questionnaire 
the leadership behavior of supervisors could 
be objectively described. The questionnaire 
measures two relatively independent leadership dimensions found
meaningful in the industrial situation: “Consideration” and
“Initiating Structure.”

There is no implication in the study as to 
the degree of each kind of behavior that is 
desirable or undesirable. Recognizing the
situational nature of leadership, the need for
relating these scales to effectiveness of par- 
ticular kinds of groups in well-controlled cri- 
terion studies is stressed. Moreover, the study 
reported here was confined to supervisors in 
one particular company. 

The questionnaire at present is regarded 
only as a research instrument for the study of 
leadership behavior. More research applying 
the scales to other industrial situations needs 
to be done before they can be more confidently 
assessed. 


Received May 5, 1952. 


References 


1. Fleishman, E. A. Leadership climate and supervisory behavior.
   Personnel Research Board, Ohio State University, 1951.
2. Hemphill, J. K. Leader behavior description. Personnel Research
   Board, Ohio State University, 1950.
3. Peters, C. C., and Van Voorhis, W. R. Statistical procedures and
   their mathematical bases. New York: McGraw-Hill, 1940.
4. Shartle, C. L. Leadership and executive performance. Personnel,
   1949, 25, 370-80.
5. Stogdill, R. M., and Shartle, C. L. Methods for determining patterns
   of leadership behavior in relation to organization structure and
   objectives. J. appl. Psychol., 1948, 32, 286-91.
6. Wherry, R. J., Campbell, J. T., and Perloff, R. An empirical
   verification of the Wherry-Gaylord iterative factor analysis
   procedure. Psychometrika, 1951, 16, 67-74.
7. Wherry, R. J., and Gaylord, R. H. The concept of test and item
   reliability in relation to factor pattern. Psychometrika, 1943, 8,
   247-64.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953 


A Validation Study of “How Supervise?” 


Joseph Weitz and Robert C. Nuckols 


Life Insurance Agency Management Association, Hartford, Conn. 


The problem of whether or not “How Super- 
vise?” is an intelligence test has been discussed 
in several recent articles. Millard¹ has briefly
discussed these studies and has presented ad- 
ditional data showing a relationship between 
this test and intelligence. We have been in- 
terested in determining whether or not “How 
Supervise?” is a test of supervisory ability and 
incidentally have some findings which may be 
relevant to its relations with intelligence. 


Procedure 


A modification of “How Supervise?” was 
taken by 78 District Managers in one life in- 
surance company. These districts are located 
throughout most of the southern and border 
states. The Managers supervise and direct the work of varying numbers
of agents, ranging from 8 to 100.

By arrangement with the Psychological Corporation the test was modified
by taking items from forms A and B and combining them into a test of
100 items. We used 20 items from the section Supervisory Practices, 32
items from Company Policy, and 48 items from Supervisor Opinions.

The only changes made in the test were these: “Admitting to your
workers when you make a wrong decision,” was changed to “Admitting to
your agents when you make a wrong decision.” “Requiring supervisors to
submit in writing their reasons for firing or penalizing any employee,”
was changed to “Requiring Managers to submit in writing their reasons
for firing or penalizing any agent.” These were the only changes made,
that is, substituting “Manager” for “supervisor” and “agent” for
“worker” or “employee.”

The 100 items were put together into one test which we called the
Manager’s P[remainder of title illegible in source]. This test was
mailed to [93?] managers. As we mentioned earlier, 78 of them returned
completed questionnaires. Here the second difference occurred, that of
a change in the testing conditions. The instructions for each section
were the same as in the original test with the exception that “Manager”
was substituted for “supervisor.” However, it was truly
self-administered with no time limit. The Managers signed the
questionnaire. This permitted validation against certain criteria data
for each district.


¹ Millard, K. A. Is How Supervise? an intelligence test? J. appl.
Psychol., 1952, 36, [pages illegible in source].


The Criteria 


Many different criteria were used. These 
included three production criteria: production 
of ordinary insurance, industrial insurance, 
and ordinary increase. (For those of us who 
do not know much about insurance terminol- 
ogy, it should suffice to say that these are 
three measures related to volume of sales.) 
We used as another criterion the number of 
men who terminated in each district during 
1951. This figure was corrected for size of 
the district. We also used the four-year turn- 
over, again corrected for district size, for the 
period 1947 through 1950. (This criterion has an odd-even year
reliability of .77.) Another criterion was the persistency of the
business sold, i.e., the average lapse ratio for each
district. This might be thought of as the 
quality of the business. 

In addition to the above criteria we had 
certain biographical data on each Manager. 
The only part of this information which we 
will discuss in the present paper is the highest 
school grade completed. 


Scoring of the Test 


A number of different scores were obtained for each part and for the
total test. For the total and for each of the parts we obtained the
number of items right, the number wrong, the number right minus the
number wrong, and the number of question marks. The correct answers
were obtained by using the key originally devised for each of the
appropriate items in “How Supervise?”


Table 1


Correlation of Scores vs. Criteria Measures

[Column headings are only partly legible in the source. Columns 1 to 4
are "Per Man" production measures, column 5 is 1951 Turnover, column 6
is 4-Year Turnover, and the heading and entries of column 7 are
illegible. Unreadable cells are marked [ill.].]

                  (1)     (2)     (3)     (4)     (5)     (6)     (7)
Number Right
  Part I         −.08     .19     .03     .04     .11   [ill.]  [ill.]
  Part II        −.12     .10     .05     .05     .11   [ill.]  [ill.]
  Part III       −.01     .12     .02    −.10    −.06    −.19   [ill.]
  Total          −.08     .09    −.01    −.10    −.11    −.26   [ill.]
Number Wrong
  Part I          .12     .07     .09     .09     .07     .08   [ill.]
  Part II         .26     .00     .23     .14    −.13     .14   [ill.]
  Part III        .20    −.06     .16     .19    −.05     .00   [ill.]
  Total           .28    −.02     .24     .21    −.08     .09   [ill.]
Right − Wrong
  Part I         −.12     .11     .06     .07    −.11    −.18   [ill.]
  Part II        −.20     .06   [ill.]    .10   [ill.]   −.22   [ill.]
  Part III       −.10     .11    −.06    −.15    −.02    −.13   [ill.]
  Total          −.19     .07    −.12    −.17    −.02    −.23   [ill.]
Number of ?
  Part I          .01    −.24    −.03    −.01     .07     .14   [ill.]
  Part II         .09     .11    −.14    −.06     .23   [ill.]  [ill.]
  Part III       −.15    −.10    −.16    −.04   [ill.]    .22   [ill.]
  Total          −.12    −.08    −.15    −.05   [ill.]  [ill.]  [ill.]
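
The four scores named in this section can be computed from one pass over a manager's answers. This is a minimal sketch: the answer codes (`"D"`, `"U"`, `"?"`) and the function name are assumptions for illustration and do not reproduce the published scoring key.

```python
def score_part(responses, key):
    """Score one part of the test.

    responses and key are equal-length lists; "?" marks an undecided
    answer. The key is assumed to contain no "?" entries.
    """
    right = sum(1 for r, k in zip(responses, key) if r != "?" and r == k)
    question_marks = responses.count("?")
    wrong = len(responses) - right - question_marks
    return {
        "right": right,
        "wrong": wrong,
        "right_minus_wrong": right - wrong,
        "question_marks": question_marks,
    }


# Invented four-item example.
hypothetical_key = ["D", "U", "D", "D"]
hypothetical_answers = ["D", "D", "?", "D"]
scores = score_part(hypothetical_answers, hypothetical_key)
```

Total-test scores would follow by summing each field over the three parts.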


Results 


The results are shown in Table 1. It can 
be seen that most of the correlations are be- 
low the five per cent level of significance with 
the exception of the scores vs. education where 
more of the correlations are above the five per 
cent level than could be expected by chance 
alone. 
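
The significance thresholds for a correlation at a given sample size follow from the t distribution. The sketch below assumes N = 78 managers (76 degrees of freedom), two-tailed tests, and critical t values of about 1.992 and 2.642 read from a standard t table; none of these details is stated in the article.

```python
import math


def critical_r(t_crit, df):
    """Smallest |r| reaching significance, given the critical t for df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


df = 78 - 2                        # assumed: 78 managers
r_05 = critical_r(1.992, df)       # 5% level, two-tailed (tabled t, approximate)
r_01 = critical_r(2.642, df)       # 1% level, two-tailed (tabled t, approximate)
```

With roughly 112 coefficients in a table of this size, about five or six would be expected to pass the 5% threshold by chance alone, which is the comparison the paragraph above draws.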

After finding no over-all significant rela- 
tionship between the scores and the criteria, 
we did an item analysis on half of the cases. 
Using high and low district termination rate 
as the criteria we isolated those items, about 
twenty in all, which seemed to differentiate 
these two groups to some extent. In those items which differentiated
the groups, the answers predominately given by the low termination
group were scored as correct. We then applied our new scoring key to
the other half of the sample. It did not cross-validate; in the other
half of the sample there was no relationship between the score obtained
with the new key and termination rate.

Note to Table 1: .22 significant at 5% level; .29 significant at 1%
level.
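
The item analysis and cross-validation steps can be sketched as follows. Everything here is illustrative: the minimum gap used to call an item "differentiating" (`min_gap`), the answer codes, and the toy data are assumptions, as the article gives no selection rule beyond "seemed to differentiate."

```python
from collections import Counter


def build_key(low_group, high_group, min_gap=0.2):
    """Derive a scoring key from the development half of the sample.

    low_group / high_group: lists of answer lists for managers in low and
    high termination districts. An item enters the key when the answer
    most common in the low group is chosen at least `min_gap` more often
    there than in the high group (hypothetical rule).
    """
    key = {}
    for i in range(len(low_group[0])):
        answer, count = Counter(r[i] for r in low_group).most_common(1)[0]
        p_low = count / len(low_group)
        p_high = sum(r[i] == answer for r in high_group) / len(high_group)
        if p_low - p_high >= min_gap:
            key[i] = answer  # low-termination answer scored as correct
    return key


def score_with_key(responses, key):
    """Number of keyed items answered the way the low group tended to."""
    return sum(responses[i] == answer for i, answer in key.items())


# Invented development half: three managers per group, three items.
low = [["D", "U", "D"], ["D", "U", "U"], ["D", "D", "D"]]
high = [["U", "U", "D"], ["U", "U", "D"], ["D", "U", "U"]]
new_key = build_key(low, high)

# The key is then applied, unchanged, to managers in the held-out half.
holdout_score = score_with_key(["D", "U", "U"], new_key)
```

The point of the design is that the key is never adjusted on the second half; a key that merely capitalized on chance in the first half, as happened here, shows no relationship with the criterion in the second.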


Conclusion 


If the minor modifications of the test did not change “How Supervise?”
materially, it would look as if this test is not valid in this
situation for predicting agent turnover or production, both of which we
feel should be related to supervisory ability. From our results the
only thing this test seems to relate to is educational (intelligence?)
achievement.


Received November 28, 1952. 
Early publication. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953 


Prediction of Labor Turnover by Aptitude Tests 


Clarence W. Brown and Edwin E. Ghiselli 


University of California, Berkeley, California 


With the popularization of the finding from 
World War I of a positive relationship be- 
tween occupational level and intelligence test 
score, the notion developed that for each oc- 
cupation there is an optimal level of intelli- 
gence. This belief led to a series of studies, 
particularly during the 1920's, which in gen- 
eral showed a curvilinear relationship to exist 
between scores on intelligence tests and labor 
turnover. Those individuals on a particular 
job who earn intelligence test scores at ap- 
proximately the average of the group tested 
tend to remain on the job a longer time than 
those who earn scores at either extreme (e.g., 5).

Little attention has been given to the problem
of the relationship between labor turnover
and scores on types of tests other than those 
of intelligence. With tests of “specific” apti- 
tudes the primary interest has been in dis- 
covering the correlations between test scores 
and some criterion measures of job proficiency 
or success in training. It is possible that the 
criterion of length of service on the job might 
have a curvilinear relationship with scores on 
specific aptitude tests as length of service has 
been shown to have with intelligence tests. If 
this were the case then there would be reason 
to question the notion that optimal intelligence 
test scores for various jobs are indicative of 
the “intellectual requirements” of the jobs. 
Intellectual factors other than those measured 
by intelligence tests would have to be considered. The study reported
here was undertaken to investigate the nature of the relationship
between scores on tests not ordinarily considered intelligence tests
and labor turnover.


Methods and Procedure


The subjects used in the present investigation were taxicab drivers. At
the time they applied for work they were administered a number of tests
as a part of the hiring procedure. To some extent the scores on the
tests were taken into account in the decision regarding employment. But
other factors such as age, nature of previous experience, and scores on
an interest questionnaire also entered into the hiring decision.

Those men who were ultimately hired were 
divided into two groups, those who stayed on 
the job for three months or more and those 
who left in less than three months. No dif- 
ferentiation was made between individuals 
who were separated for cause and those who 
left voluntarily. The number of enforced 
separations was very small, and resignations 
appeared in some cases not to be wholly volun- 
tary but rather as a means for avoiding dis- 
ciplinary action. The only individuals not 
included were those who were terminated be- 
cause of illness, called to the armed services, 
or transferred to other jobs within the com- 
pany. 

All of the tests utilized were of the paper 
and pencil variety. The tests are listed in the 
accompanying tables. All three arithmetic 
tests involved computations but varied in the 
complexity of the problems presented. The 
Speed of Reactions tests involved making dif- 
ferential checking responses in accordance 
with pre-established rules to presentations of 
letter stimuli varying in spatial organization. 
In Test I the rules were given on each page 
and in Test II the rules had to be remem- 
bered. In the Dotting and Tapping tests, 
scores were based upon the speed with which 
dots were placed in small printed circles by 
means of a pencil. In the Dotting test, preci- 
sion of movement was more of a factor than 
in the Tapping test because the circles were 
much smaller in size. The Judgment of Dis- 
tance test required judgments about the rela- 
tive distance of pictured objects based pri- 
marily on cues of perspective and interposition 
of objects. The Distance Discrimination test 
required judgments about the relative lengths 
of lines. In the Mechanical Principles test, problems involving
knowledge of mechanical functions and principles were presented. A
more detailed description of these tests has been given elsewhere (2).

All men did not take all tests. In the pres- 
ent analysis the numbers of cases per test 
ranged from 218 to 441. Scores on each test 
were transmuted into normal standard scores 
on a nine-point scale following the procedure 
utilized in the Aviation Psychology Program 
of the Air Force (3). In standardizing the 
tests on this scale all applicants were utilized, 
whether they were hired or not. The distribu- 
tions of scores of cases utilized in the subse- 
quent analyses are given in Table 1. 
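
A conversion of raw scores to a nine-point normalized scale can be sketched as below. The cumulative percentage boundaries are the conventional 4-7-12-17-20-17-12-7-4 split; that this is exactly the procedure of the Aviation Psychology Program is an assumption, and the function name is hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Cumulative percentage boundaries between scale points 1|2, 2|3, ..., 8|9
# (assumed conventional split, not quoted from the source).
CUTS = [4, 11, 23, 40, 60, 77, 89, 96]


def to_nine_point(raw_scores):
    """Map raw scores to 1..9 by mid-rank percentile in the norm group."""
    n = len(raw_scores)
    counts = Counter(raw_scores)
    below = {}
    running = 0
    for value in sorted(counts):
        below[value] = running       # cases scoring strictly lower
        running += counts[value]
    converted = []
    for x in raw_scores:
        percentile = 100.0 * (below[x] + 0.5 * counts[x]) / n
        converted.append(bisect_right(CUTS, percentile) + 1)
    return converted


# Norm group of 100 distinct raw scores, standing in for "all applicants".
norm_group = list(range(1, 101))
tally = Counter(to_nine_point(norm_group))
distribution = [tally[s] for s in range(1, 10)]
```

Because the scale is set on all applicants, hired or not, the hired drivers analyzed later need not show this symmetric distribution.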


Results 


For each score on the various tests the per 
cent of individuals leaving the job in less than 
three months is given in Table 2. These data 
are shown graphically in Figure 1. For three 
of the tests, Speed of Reactions II, Judgment 
of Distance, and Mechanical Principles, no re- 
lationship of any kind is apparent between 
test scores and turnover. For the remaining 
tests, curvilinear relationships occur and tend 
to be of the U-shaped kind found with in- 
telligence tests, that is, individuals earning 
either high or low scores are more likely to 
quit the job sooner than those earning scores 
in the middle of the range. 

The most consistent and striking relationship between test scores and
turnover holds for the Intermediate Arithmetic test. For Complex
Arithmetic, Simple Arithmetic, Speed of Reactions I, Dotting, Tapping,
and Distance Discrimination, scores and turnover seem to be correlated
to about the same degree. No relationship is apparent between turnover
and scores on Speed of Reactions II, Judgment of Distance, and
Mechanical Principles.


Table 1


Numbers of Hired Taxicab Drivers Earning Various
Scores on the Several Tests

                                          Score
Test                       1 to 3    4     5     6     7   8 and 9
Complex Arithmetic            24    66    38    41    25     24
Intermediate Arithmetic       26    34    56    48    30     29
Simple Arithmetic             37    41    47    44    23     26
Speed of Reactions I          53    77    93    83    62     52
Speed of Reactions II         55    82    94    78    58     53
Dotting                       80    89    88    84    53     47
Tapping                       76    96    91    76    57     45
Judgment of Distance          62    47    67    87    34     35
Distance Discrimination       72    61   105    92    56     55
Mechanical Principles         56    51    87    52    44   [ill.]

[One entry for Mechanical Principles is missing in the source; the five
legible values are listed in the order found, so their columns are
uncertain.]


Table 2


Per Cent of Taxicab Drivers Leaving Their Jobs in
Less Than Three Months in Relation to
Scores on Various Tests

                                          Score
Test                       1 to 3    4     5     6     7   8 and 9
Complex Arithmetic            42    30    37    17    20     33
Intermediate Arithmetic       62    35    14    29    40     50
Simple Arithmetic             30    39    23    27    30     31
Speed of Reactions I          40    31    33    27    35     42
Speed of Reactions II         35    34    33    33    33     36
Dotting                       41    39    25    32    26     40
Tapping                       38    38    30    29    28     44
Judgment of Distance          31    40    37    40    35     37
Distance Discrimination       32    41    29    32    34     44
Mechanical Principles         43    24    38    42    25     40

Scores on all three of the arithmetic tests are related to turnover, as
are scores on three of the four speed tests (Speed of Reactions I,
Dotting, and Tapping), and one of the two spatial tests (Distance
Discrimination). It is therefore difficult to arrive at any
generalization concerning the general factors in the tests which give
the best forecast of turnover.

On the seven tests that are related to turnover, the optimal score
varies between 5 and 6. It is 5 or very close to that value for
Intermediate Arithmetic and Simple Arithmetic; about 6 for Complex
Arithmetic, Speed of Reactions I, Dotting, and Tapping; and 5.5 for
Distance Discrimination. Thus the optimal score on each of these tests
is equivalent to or a little higher than the average score of this
particular group of applicants.
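
A rough way to locate an optimal score is to take the score category with the lowest per cent leaving in Table 2. The sketch below does only that. The column labels are an assumption (the table headings are partly illegible in the source), and the raw minimum need not equal the optima quoted in the text, which were presumably read from smoothed curves in Figure 1.

```python
# Score categories as assumed from the partly legible table headings.
SCORE_LABELS = ["1-3", "4", "5", "6", "7", "8-9"]

# Per cent leaving in under three months, as printed in Table 2.
PERCENT_LEAVING = {
    "Complex Arithmetic":      [42, 30, 37, 17, 20, 33],
    "Intermediate Arithmetic": [62, 35, 14, 29, 40, 50],
    "Simple Arithmetic":       [30, 39, 23, 27, 30, 31],
    "Speed of Reactions I":    [40, 31, 33, 27, 35, 42],
    "Dotting":                 [41, 39, 25, 32, 26, 40],
    "Tapping":                 [38, 38, 30, 29, 28, 44],
    "Distance Discrimination": [32, 41, 29, 32, 34, 44],
}


def lowest_turnover_score(test_name):
    """Return (score label, per cent leaving) at the observed minimum."""
    row = PERCENT_LEAVING[test_name]
    index = min(range(len(row)), key=row.__getitem__)
    return SCORE_LABELS[index], row[index]


def is_u_shaped(test_name):
    """Crude check: both extreme categories exceed the minimum."""
    row = PERCENT_LEAVING[test_name]
    return row[0] > min(row) and row[-1] > min(row)


optima = {name: lowest_turnover_score(name) for name in PERCENT_LEAVING}
```

For Intermediate, Simple, and Complex Arithmetic and Speed of Reactions I the observed minimum agrees with the stated optimum; for Dotting and Tapping it falls one category to either side of 6, which is why a fitted curve rather than a raw minimum would be the safer reading.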


Discussion 


From the findings of the present study, it is apparent that scores on
some tests which in content are quite different from intelligence tests
are related to labor turnover in the same manner as are scores on
intelligence tests. Not only is the nature of the relationship the
same, as a U-shaped relation, but the optimal scores, scores where
turnover is at a minimum, fall at about the same place in the
distribution of scores as do the optimal scores on intelligence tests.
Some of the tests utilized in the present investigation, such as
tapping and dotting, obviously measure quite simple functions which are
unrelated to those measured by ordinary intelligence tests. The nature
of the relationship with turnover, however, is the same.


[Figure 1: ten panels, one per test (Complex Arithmetic, Intermediate
Arithmetic, Simple Arithmetic, Speed of Reactions I, Speed of Reactions
II, Dotting, Tapping, Judgment of Distance, Distance Discrimination,
Mechanical Principles), each plotting per cent leaving against score on
the test.]

Fig. 1. Per cent of taxicab drivers leaving their jobs in less than
three months in relation to scores on various tests.
The optimal scores on intelligence tests found in previous
investigations, together with the relationship between intelligence
test scores and occupational level, have been taken to indicate the
“intellectual requirements” of jobs. In the present study we find the
same kind of optimal scores with tests quite different from
intelligence tests. Furthermore it has been found that even with tests
of simple functions a similar relationship exists between scores and
occupational level (4). It is not altogether certain, then, that the
hierarchical levels of occupations with respect to intelligence test
score are to be accounted for solely on the basis of “intellectual
requirements.”

Finally, it can be pointed out that in some instances turnover and
intelligence test scores though correlated are not related in the
U-shaped fashion. In Table 3 are data we have reported in a different
form elsewhere concerning the relationship between intelligence test
scores and turnover among bus drivers (1). In this occupation the large
number of terminations was again the result of voluntary separation.
From Table 3 it can be seen that turnover is at a minimum at the high
score levels and as the scores decrease there is an increasing rate of
turnover.


Table 3


Turnover Among Bus Drivers in Relationship to
Intelligence Test Score

                    % Leaving Under
Score        N         6 Months
50-60       24            33
40-49       67            49
30-39       85            57
20-29       40            60
0-19        13            62

It is not clear just exactly what generaliza- 
tions can be drawn concerning the nature of 
the relationship between test scores and turn- 
over. Certainly the use of the concept of “in- 
tellectual requirements” does not seem to be 
a satisfactory explanation. That is, the idea 
that turnover is a function of the distance, 
either plus or minus, of the person’s intelli- 
gence from the mean intelligence for the job 
is not necessarily borne out. Just what types 
of aptitude tests give satisfactory predictions 
of turnover and what the nature of the rela- 
tionship is between turnover and various apti- 
tude qualifications is still obscure. 


Summary 


Ten tests measuring several kinds of apti- 
tudes were administered to groups of 218 to 
441 taxicab drivers. For seven of the tests a 
U-shaped relationship was found between test 
scores and turnover, those individuals earning 
either high or low scores being more likely to
leave the job than those earning scores around
the average of the group. Since this relation- 
ship is very similar to that found between 
scores on intelligence tests and turnover, it 
was concluded that the notion of “intellectual 
requirements” as an explanation of the U- 
shaped relationship between turnover and in- 
telligence test scores is not wholly satisfactory. 


Received May 2, 1952. 


References 


1. Brown, C. W., and Ghiselli, E. E. Factors related to the proficiency
   of motor coach operators. J. appl. Psychol., 1947, 31, 477-479.
2. Brown, C. W., and Ghiselli, E. E. Age of semi-skilled workers in
   relation to abilities and interests. Personnel Psychol., 1949, 2,
   497-511.
3. Flanagan, J. C. The Aviation Psychology Program in the Army Air
   Forces. Report No. 1, 1948, U. S. Gov’t Print. Office.
4. Ghiselli, E. E. Intelligence test use in vocational guidance. In
   Kaplan, O. J. (Ed.) Encyclopedia of Vocational Guidance, Phil.
   Library, 1948.
5. Viteles, M. S. Selection of cashiers and predicting length of
   service. J. Personnel Res., 1924, 2, 467-473.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953 


Validity of a Structural Dexterity Test


M. Irving Chriswell 
Buffalo Technical High School, Buffalo, N. Y. 


A test of Structural Dexterity emerged after 
a long period of testing in Technical High 
School of Buffalo, New York. Earlier ex- 
periments with the assembly of mechanical 
objects gave way to the construction of pro- 
gressively complex structures of three dimen- 
sions. 

In this test two different lengths of metal 
bars and pins comprised the unit parts. These 
were manually inserted and built upon a board 
divided into sections with holes drilled for 
each unit structure. The subject built each
structure by interpreting the size and position
of the parts from perspective sketches
presented on a card. Features of the test 
follow: 


Fig. 1. Photograph of apparatus and sketches of the Structural
Dexterity Test.


1. A configuration of holes was adopted 
which became the basis for twelve different 
structures. The complete test utilized all 
areas of the board twice. 

2. The progression from simple to complex 
structure gradually advanced the subject from 
one to two, and then from two to three level 
structures; from right angle to oblique posi- 
tions; and from firmly built structures to 
movable balanced structures which required 
greater finger dexterity. 

3. The score was determined by adding up 
the total number of pins and bars correctly 
placed. Deductions were made for errors. 


Testing time: 14 minutes. See accompanying 
photo in Figure 1. 




The Criteria 


Five criteria were developed for this test. The first criterion (C₁)
was the average of two ratings by a machine shop instructor. These
ratings were not upon any specific job or project, but covered specific
working traits, after three or four months of shop work. The second
criterion (C₂) was the time in clock hours for the student to complete
an assigned project of a small “C” clamp. Instructions were uniform and
a detailed drawing and list of operations were furnished each student.
The third criterion (C₃) was an averaged evaluation of layout,
precision, and quality of work on the same “C” clamp by two judges, A
and B, who were uninformed of the shop experience and behavior of
individual students. The following scale provided the objective
evaluation:


Clamp screw
  1. Threading
  2. Knurling
  3. Total length
  4. Knurl length
  5. Hole, drilled
  6. Thread tested
  7. Chamfer
     Total score

Clamp
  1. Outside contour
  2. Inside contour
  3. Filing
  4. Finish
  5. Hole, true
     Total score

Total, clamp and screw


Outside and inside contours were judged with the aid of a special steel
template on a [fraction illegible in source]” tolerance basis.


The fourth criterion (C₄) was the averaged evaluation, C₃, plus a time
bonus. This bonus was developed from the time in hours for each job and
was determined by the shop instructors: it weighted time compared with
quality of work on a 1:2 ratio. The fifth criterion (C₅) was the shop
teacher’s evaluation on the objective scale plus the time bonus.


The Group and the Measures Used 


A group was chosen which could be readily 
and precisely rated on their shop work. All 
pupils registered in first year Machine Shop 
were selected. This comprised a sub-group of 
62 students in the 9th grade of the Electrical 
Course and another sub-group of 38 10th 
grade students of the Mechanical Course. 
Scores were available for these sub-groups on the following measures:
Henmon-Nelson Intelligence Quotient (IQ); Space and Numerical Ability
(DS and DN) on the Differential Aptitude Tests (DAT); the Structural
Dexterity Test described (SD); the Purdue Pegboard (PP), using the
total score of Right plus Left plus Both Hands; and a test of
Repetitive Operations (RO), comprising nuts and bolts to be fastened to
a block with the aid of an end wrench.

A comparison of the 9th and 10th grade sub-groups was undertaken and
the means of the scores on the Structural Dexterity and the Purdue
Pegboard tests were found to be significantly greater for the 10th
grade than for the 9th grade sub-group. SD correlated with C[?] .44 for
the 9th grade and .17 for the 10th grade sub-group. The Purdue Pegboard
as well as both Differential Aptitude tests gave consistently low
correlations (.08 to .30) with C[?] for both sub-groups. A definite age
difference of one year and three months existed between the sub-groups,
and a marked difference in age correlations appeared: Age with C[?]
gave .15 for 9th grade and −.32 for 10th grade sub-group. (The
criterion subscripts, shown here as [?], are illegible in the source.)
Since these were the only unusual differences noted in the sub-groups,
the combination seemed justifiable.


Reliability and Validity 


Evidence of the reliability of the criteria was obtained. As previously
stated, the third and fourth criteria were based upon the evaluations
of two judges. The fifth criterion was based upon the evaluation of the
shop instructor. Correlations of these evaluations follow:

                            Shop
             Judge B     Instructor
Judge A        .78          .76
Judge B                     .72


Table 1

Intercorrelations in the Prediction of Several Shop Success Criteria.
Pearson formula used for all coefficients. N = 100

        RO     PP     DN     DS     IQ     Age    C₁     C₂     C₃     C₄     C₅
SD     .48    .44  [ill.]   .29    .16    .18    .38   −.38    .30    .41    .51
RO            .19   −.05   −.05    .12    .00    .25   −.43    .20    .35    .34
PP                  −.02    .14    .14    .25    .10   −.31    .17    .26    .27
DN                       [ill.]    .48   −.05    .19   −.05    .04    .08   −.03
DS                                 .29    .11    .29   −.38  [ill.]   .25    .21
IQ                                       −.37    .01   −.11    .00    .09    .08
Age                                              .08   −.13  [ill.]   .30    .34
C₁                                                     −.23    .43    .49    .39
C₂                                                            −.38   −.78   −.69
C₃                                                                    .82    .76
C₄                                                                           .87

[Entries marked [ill.] are illegible in the source. The SD entries for
RO, PP, C₄, and C₅ are garbled in the source and are given as stated in
the Summary.]


Using the correlation between Judges A and 
B, the Spearman-Brown formula yields a co- 
efficient of .87 for the group of 100 students 
evaluated. This may be considered the re- 
liability of the third criterion, and the mini- 
mum reliability of the fourth criterion. 
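
The step from the inter-judge correlation to the reliability of the averaged evaluation is the Spearman-Brown formula for a composite of two parallel measures. A minimal sketch, with a hypothetical function name:

```python
def spearman_brown(r, k=2):
    """Reliability of a composite of k parallel measures that
    intercorrelate r (k = 2 for two judges or two test halves)."""
    return k * r / (1 + (k - 1) * r)


# Judges A and B correlate .78; their averaged evaluation is criterion 3.
reliability_c3 = spearman_brown(0.78)
```

This gives about .876, which the text reports as .87.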

The reliability of the SD test was determined by a method similar to
the split-half technique. The coefficient for a group of 92 students in
9th and 10th grades was .88. Applying the Spearman-Brown formula the
entire test would give .94.

The intercorrelations of factors and criteria are presented in Table 1.

With the exception of C₂, the time criterion, it is significant that
the Structural Dexterity Test has higher validity than any of the other
selected tests. More significant results might be obtained with age
held constant.


Summary 


1. A test of structural dexterity gave sig-
nificant differentiation in the performance of
9th and 10th grade technical high school stu-
dents. The reliability by odd-even technique,
employing the Spearman-Brown formula, was
.94 for a group of 92 students.


2. This test appears to be a valid measure
of mechanical ability in a limited sense. It is
a definite aid in the prediction of general ma-
chine shop success. The correlation for 100
subjects with averaged shop instructors’ rat-
ings on specific shop traits was .38; with time
in hours to complete a specific job -.38; with
averaged evaluation of a specific job by two
judges .30; with this averaged evaluation plus
a time bonus .41; and with a shop instructor’s
evaluation plus a time bonus .51.

3. This test of structural dexterity shows
significant overlap with a test of repetitive
operations (.48) and with the Purdue Peg-
board, Right plus Left plus Both Hands score
(.44).

4. With multiple correlation formula, based 
upon the data presented, it was found that 
four selected factors, Structural Dexterity, 
Repetitive Operations, Space Relations (Dif- 
ferential Aptitude Battery) and Age, predicted 
the averaged evaluation plus the time bonus 
to the extent of .53. The multiple correlation 
between the same four factors and a shop in- 
structor’s evaluation plus the time bonus was 
Bais 
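
As a rough check on the first multiple correlation in item 4, the sketch below computes R from a correlation matrix by solving for the standardized regression weights. It assumes the coefficients as read from Table 1 (SD, RO, DS and Age against C4) and assumes that DS is the Space Relations test of the Differential Aptitude Battery; the function names are illustrative only.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_r(predictor_corr, validities):
    """Multiple correlation from predictor intercorrelations and validities."""
    betas = solve(predictor_corr, validities)
    r_squared = sum(b * v for b, v in zip(betas, validities))
    return math.sqrt(r_squared)


# Order of predictors: SD, RO, DS, Age (values as read from Table 1).
predictors = [
    [1.00, 0.48, 0.29, 0.18],
    [0.48, 1.00, -0.05, 0.00],
    [0.29, -0.05, 1.00, 0.11],
    [0.18, 0.00, 0.11, 1.00],
]
validities_c4 = [0.41, 0.35, 0.25, 0.30]  # correlations with C4

r_c4 = multiple_r(predictors, validities_c4)
print(round(r_c4, 2))
```

With these inputs the result is close to the .53 reported for the averaged evaluation plus the time bonus; small differences are to be expected from two-place rounding of the tabled coefficients.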


Received March 24, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953 


The Changing of Mental Test Norms in a Southern
Industrial Plant


Joseph E. Moore 
Georgia Institute of Technology 


and 


Laurence W. Ross 


Union Bag and Paper Corporation, Savannah, Georgia 


In 1947 Bennett and Wesman (1) pre- 
sented certain scores, which had been accumu- 
lated by Union Bag and Paper Corporation of 
Savannah, Georgia, on white men and women 
job applicants. The authors stated that for 
a given population local norms were the most 
meaningful. They also pointed out the often 
occurring problem of differences between local 
norms and “national” norms on which the test
was originally based.

The problem of changes occurring in a given 
plant population from year to year naturally 
arises. It was our hope that the present study 
would throw some light on this subject. Does 
the level of performance, as measured by the 
Revised Beta Examination, remain relatively 
stationary for job applicants over a period of 
four or five years in a particular industrial 
plant? 

The data on which this study was based 
cover a period from 1947 to 1951 and include 
all white men and women applicants who were 
given the Revised Beta Examination. The 
Union Bag and Paper Corporation requires all 
applicants who pass a preliminary interview to 
take a battery of psychological tests one of 
which is the Revised Beta Examination. 

The subjects used in this study were 8,818 
white men and 5,288 white women who ap- 
plied for work at the Union Bag and Paper 
Corporation between the years 1946 and 1951. 
The average score (all scores used in this 
study are unweighted) earned by the men on 
the Beta Test was 83.8; the Standard Devia- 
tion for these scores was 15.9. The median 
score for the men was 84.5. The Stanford 
Binet Mental Age equivalent for this group of 
men applicants is 14 years (2). The Otis
Self-Administering Test, Higher Examination,
Form A, score equivalent for the average of
our group would be 33 points.

The average score for the women was 77.4 
with a Standard Deviation of 15.8. The 
median for this group was 78.4. The Stanford 
Binet Mental Age equivalent for the women 
is 13 years, 1 month. A comparable score on
the Otis Self-Administering Test, Higher Ex-
amination, Form A, would be 23.

The scores on the Revised Beta Examina-
tion for the groups in this study were com-
pared with the scores obtained by Bennett and
Wesman in an earlier study in the same plant
in 1947 (1). Table 1 presents the reliability
of the difference between the means of these
groups of men applicants.


Table 1

Reliability of the Difference Between Mean Scores
on The Revised Beta Examination for
Men Industrial Applicants

Group                        Number    Mean    S.D.     “t”
Bennett & Wesman (1947)       1,362    80.5    17.7    6.32**
Moore & Ross (1951)           8,818    83.8    15.9

** Significant at .01 level of confidence.


In Table 1 it will be seen that the two
groups of men industrial applicants are statis-
tically significantly different. The mean men-
tal test score, however, earned by the 1951
men applicants is only 3.3 points higher than
the mean of the 1947 group studied by Ben-
nett and Wesman. This is less than one-fifth
of the Bennett and Wesman S.D. of 17.7.
The 1951 mean is at the 55 percentile point on
the 1947 percentile norms.


Table 2

A Comparison of The Mean Scores on the Revised
Beta Examination of Two Groups of
Women Applicants

Group                        Number    Mean    S.D.     “t”
Bennett & Wesman (1947)       1,083    72.9    17.5    7.75**
Moore & Ross (1951)           5,288    77.4    15.8

** Significant at the .01 level of confidence.


In Table 2 it will be seen that a difference 
in the mean scores of the women applicants 
on the Revised Beta Examination has also 
occurred. The 1951 group is performing at 
a higher level on the Beta Test than was true 
of the 1947 group. The 4.5 points difference 
in the mean is slightly larger for the women 
applicants than was found in the case of the 
men applicants. This 4.5 point difference is 
about one-fourth of the Bennett and Wesman 
S.D. of 17.5. The 1951 mean is also at the 55 
percentile point on the 1947 percentile norms.
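
The comparisons in Tables 1 and 2 can be retraced approximately from the published means, standard deviations and group sizes. The sketch below assumes the standard error of the difference is taken with separate (unpooled) variances; the article does not state which formula was used, so the function and its name are illustrative only.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


# 1947 group first, 1951 group second, as in the tables.
men = critical_ratio(80.5, 17.7, 1362, 83.8, 15.9, 8818)
women = critical_ratio(72.9, 17.5, 1083, 77.4, 15.8, 5288)

# Size of each shift expressed as a fraction of the 1947 S.D.
men_shift = (83.8 - 80.5) / 17.7
women_shift = (77.4 - 72.9) / 17.5

print(round(men, 2), round(women, 2))
print(round(men_shift, 2), round(women_shift, 2))
```

Under this assumption both ratios fall in the neighborhood of the tabled “t” values without matching them exactly, which is plausible if the original computation used unrounded means or a slightly different error formula. The two shift fractions agree with the text: somewhat under one-fifth of the 1947 S.D. for the men and about one-fourth for the women.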


Summary and Conclusion 


Bennett and Wesman reported on men and
women applicants who took the Revised Beta
Examination prior to 1947. The data from
these two investigators were compared with
men and women applicants who took the ex-
amination between the years 1947 and 1951.

The present study shows that the men and 
women applicants seeking employment at this 
paper plant between 1947 and 1951 earned 
statistically significantly higher scores than 
did the group seeking employment prior to 
1947. The difference between the mean scores 
of both the men applicants and the women ap- 
plicants was, however, not striking, being 
about one-fourth of the 1947 S.D. The 1951 
mean is at the 55 percentile point on the 1947 
percentile norms. 

The direction of the change is upward to- 
wards applicants who perform in such a way 
that they earn higher test scores on the Re- 
vised Beta Examination. The reason for these 
changes lies beyond the scope of this study. 


Received April 14, 1952. 


References 


1. Bennett, George K., and Wesman, Alexander C. 
Industrial test norms for a southern plant pop- 
ulation. J. appl. Psychol., 1947, 31, 241-246. 

2. Kellogg, C. E., and Morton, N. W. Manual, re- 
vised beta examination, New York: The Psy- 
chological Corporation. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 1, 1953 


The Validity of Personality Inventories in the Selection of 
Employees 


Edwin E. Ghiselli and Richard P. Barthol 


University of California, Berkeley, California 


Industrial and governmental organizations 
have for many years utilized tests of various 
kinds as aids in the selection of employees. 
Certain types of tests, e.g., aptitude, profi- 
ciency, and intelligence tests, have been shown 
to have merit in improving selection tech- 
niques. In recent years personnel workers 
have become increasingly conscious of the im- 
portance of personality factors as contributors 
to employee satisfaction or unrest. Personal- 
ity tests and inventories have been used to 
supplement the subjective evaluation of these 
factors by the employment interviewers. A 
number of studies have been reported on the 
validity of personality inventories as selec- 
tion devices, but these have been widely scat- 
tered through the literature. The purpose of 
this report is to summarize these studies so 
that the usefulness of the personality inven- 
tory can be more easily assessed. 


Methods and Procedure 


In order to secure pertinent information we 
searched the various professional journals and 
books published from 1919 to date. From 
each study which reported findings concerning 
the validity of personality inventories for em- 
ployment purposes, we noted the validity co- 
efficient, the number of cases, and the job on 
which the group was employed. The studies 
included in the present analysis were re- 
stricted to those conducted in the United 
States, and to those in which the criterion was 
some index of job proficiency such as produc- 
tion records or ratings by superiors. An at- 
tempt was also made to include only those 
studies in which the scoring key for the per- 
sonality inventory was developed independ- 
ently of the group for which the validity co- 
efficient was reported. Approximately 40% 
of the material reported in this paper is un-
published, having been drawn from various
business, industrial, and governmental organ-
izations.


In selecting the data for this study we ex-
amined the articles reporting the use of per-
sonality inventories and excluded those report-
ing traits that appeared to have little or no
importance for the job in question. Thus,
an inventory designed to measure sociability
would be included for sales persons but not
for machinists. We have grouped together all
the remaining inventories regardless of the 
trait presumably measured. This was neces- 
sary because many utilize trait names that are 
very broad or not in general use, and some 
inventories do not name the trait at all. 


Results 


In order to show the general trends the
weighted mean validity coefficient was com-
puted through Fisher's z for each of the major
occupational groups. These values, together
with the numbers of cases and numbers of
validity coefficients, are given in Table 1. The
distributions of the validity coefficients by oc-
cupation are shown in Figure 1.
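
A weighted mean correlation taken through Fisher's z can be sketched as follows. The weighting by n - 3 is a common convention and is an assumption here, since the article does not say how the coefficients were weighted; the example coefficients and sample sizes are hypothetical and are not taken from the studies surveyed.

```python
import math


def weighted_mean_r(rs, ns):
    """Average correlations in Fisher's z units, then convert back to r.

    Each coefficient is weighted by n - 3 (an assumed convention).
    """
    zs = [math.atanh(r) for r in rs]      # r -> z
    weights = [n - 3 for n in ns]
    mean_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(mean_z)              # z -> r


# Hypothetical validity coefficients and numbers of cases for one group.
example = weighted_mean_r([0.10, 0.25, 0.40], [40, 120, 60])
print(round(example, 2))
```

Averaging in z units rather than averaging the raw coefficients keeps large coefficients from being understated, and the weights let the larger studies count for more.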

There have apparently been few studies
made on the efficacy of personality inventories
for higher level supervisors. Contrary to ex-
pectations, the mean validity coefficient of
only .14 is low and the distribution is some-
what scattered. There is one case of fewer
than 50 subjects with a substantial coefficient
of correlation.


Table 1

Weighted Mean Validity Coefficients of Personality
Inventories for Various Occupational Groups

          Total      Total
Mean      No. of     No. of
  r       Cases        rs      Occupation
 .14        518         8      General Supervisors
 .18       6433        44      Foremen
 .25       1069        22      Clerks
 .36       1120         8      Sales Clerks
 .36        927        12      Salesmen
 .24        536         5      Protective Workers
 .16        385         6      Service Workers
 .29        511         8      Trades and Crafts


Fig. 1. Distribution of validity coefficients of
personality inventories for various occupational
groups. (Panels legible in the source: General
Supervisors, Foremen, Sales Clerks, Protective
Workers, Service Workers, Trades and Crafts;
horizontal axis: Validity Coefficient, -.80 to
+.80; legend: N = 15-49, N = 50-99, N = 100+.)


There were many studies reported on fore-
men which support the conclusion that per-
sonality inventories on the average do not
have much predictive value in selecting super-
visory employees. The mean and [illegible]
coincide at .18. Apparently certain inventories
used under certain conditions give good pre-
dictive results.

The studies made on clerical workers in-
dicate that reasonably good predictions can be
made on the basis of personality inventories.
The mean value of .25 and the group of co-
efficients ranging from .50 to .65 demonstrate
that this type of inventory should be seriously
considered in devising a test battery for the
selection of these workers.

For both of the sales groups, sales clerks
and salesmen, quite substantial validities have
been found. While there have not been as
many studies with these groups as might be
expected the findings are fairly consistent.
For both the mean validity coefficient is .36.

We found only five studies in which scores 
on personality inventories were related to pro- 
ficiency among protective occupations. How- 
ever, all of these studies utilized sizeable num- 
bers of cases and are quite consistent in 
indicating moderate validity. The mean co- 
efficient is .24. 

In the studies of service workers the find- 
ings are quite inconsistent. Since the validity 
coefficients range from —.40 to +.50, the low 
mean validity coefficient for this occupational 
group cannot be considered a representative 
indication of the effectiveness of personality 
inventories. It appears that under certain 
circumstances inventories may be used effec- 
tively. 

The few applications of personality inven- 
tories to skilled workers have given quite 
promising results. The average of the valid- 
ity coefficients for the trades and crafts is 
.29. Furthermore, the findings from different
studies are quite consistent. 


Discussion 


We were able to discover a total of 113 
studies dealing with the validity of personality 
inventories in employee selection. When one 
recalls that these studies are spread over a 
number of different occupations it is apparent 
that the amount of information available for 
the evaluation of inventories is by no means 
extensive. However, a similar survey of re- 
ports concerning the validity of intelligence 
tests, certainly a much more popular instru- 
ment in employee selection, revealed only 
some 450 studies. Thus while in absolute 
terms the data may appear to be scanty, as 
compared to those available for other types 
of tests, they are fairly satisfactory. 

It has been found that under certain cir- 
cumstances scores on personality inventories 
correlate better with proficiency on a wider 
variety of jobs than might be expected. On 
the other hand there have been enough studies 
reporting negative results to emphasize cau- 
tion in their use. These inventories have 
proved to be effective for some occupations in 
which personality factors would appear to be
of minimal importance (e.g., clerks, and trades
and crafts), and ineffective for other occupa-
tions in which these factors could reasonably
be expected to be of paramount importance
(e.g., supervisors and foremen).


Received May 12, 1952.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953


Over and Under Achievement in a Sales School in Relation 
to Future Production 


Marion A. Bills and Jean G. Taylor 
Ætna Life Affiliated Companies, Hartford, Conn.


Beginning January 1, 1947, and based on
previous experimental data, the Life Agency
Department of the Ætna Life Affiliated Com-
panies decided that it would require all ap-
plicants for selling positions to take three
tests. These were: (1) Strong’s Vocational
Interest Blank; (2) the Aptitude Index pub-
lished by the Life Insurance Agency Manage-
ment Association (a scoring of an application
blank and a personality test); and (3) LOMA
Test 1-A, a mental alertness test published
by the Life Office Management Association.

The Life Agency Department has regularly
conducted schools for the training of agents.
These schools have had various purposes and
entrance requirements, but one series of
schools conducted between January 1, 1947
and October 1, 1949, known as the “basic
schools,” were primarily for new agents and
no previous selling experience or production
was required for admittance. It has been
noted from the first that there was a definite
relationship between the LOMA 1-A test
scores and the grades earned in the schools
but that this relationship was by no means
perfect. In addition, those who did better
in the school than their test scores would in-
dicate, seemed to be more successful in future
selling. However, this result was not [illegible]
statistically until this year largely because the
information came too late in our selection pro-
cedure to be of material benefit. However,
[illegible] might prove valuable in [illegible]
cases, and since a large enough number of
cases have been accumulated to justify a
statistical study, we feel that it is advantage-
ous to report certain of our results.

To keep the group as homogeneous as possible
except for the two variables studied (school
grade and LOMA 1-A test score), we limited
the group to those who had scored an “A” on
the Life Insurance Scale of Strong’s Vocational
Interest Blank and an “A” on the Aptitude
Index and had attended a “basic school.”
There were 91 agents who met these
requirements.

The grades in the “basic school” for these 
91 agents ranged from 80 to 98 (S.D. = 3.96) 
with a mean of 90. LOMA 1-A test scores 
ranged from 99 to 209 (S.D. = 20.94) with a 
mean of 146. The correlation (product mo- 
ment) between LOMA 1-A test scores and 
school grades was .64, between discrepancies 
and grade was .77, and between discrepancies 
and LOMA 1-A test scores was —.01. From 
the regression equation a predicted school 
grade was derived for each LOMA 1-A test 
score. This predicted score was then com- 
pared to the actual grade received to give an 
“index-of-achievement” score for each of the 
91 individuals (actual grades minus predicted 
grades). This “index-of-achievement” score 
ranged from +7.8 to —6.6 and had a mean 
of 0 and a standard deviation of 3.03. Those 
receiving positive scores were considered 
“over” achievers, and those receiving negative 
scores “under” achievers. 
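
The “index-of-achievement” score described above is a regression residual, and several of the figures quoted in this paragraph follow from the reported means, standard deviations and correlation. The sketch below assumes an ordinary least-squares regression of school grade on LOMA 1-A score; the constant and function names are illustrative, not the authors' own.

```python
import math

# Summary statistics as reported for the 91 agents.
MEAN_GRADE, SD_GRADE = 90.0, 3.96
MEAN_LOMA, SD_LOMA = 146.0, 20.94
R = 0.64  # correlation between LOMA 1-A score and school grade

# Regression of grade on LOMA score (assumed least-squares form).
SLOPE = R * SD_GRADE / SD_LOMA


def predicted_grade(loma_score):
    """School grade predicted from a LOMA 1-A test score."""
    return MEAN_GRADE + SLOPE * (loma_score - MEAN_LOMA)


def index_of_achievement(actual_grade, loma_score):
    """Actual grade minus predicted grade; positive means 'over' achiever."""
    return actual_grade - predicted_grade(loma_score)


# Properties of the residual implied by the regression model.
residual_sd = SD_GRADE * math.sqrt(1 - R ** 2)   # S.D. of the index
r_index_grade = math.sqrt(1 - R ** 2)            # index vs. actual grade

print(round(residual_sd, 2), round(r_index_grade, 2))
```

These implied values sit close to the figures in the text: a standard deviation of about 3.0 for the index, a correlation of about .77 between index and grade, and, by construction, a correlation of essentially zero between index and LOMA score.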

The achievement scores were divided into 
three groups with extreme “over” achievers 
falling +3.0 and over, and extreme “under” 
achievers —3.0 and under. Table 1 gives the 
results of a comparison between the achieve- 
ment scores and a combined criterion of length 
of service and premium production during the 
first year. 

A Chi Square test for Table 1 yields a value 
of 15.44 which, with four degrees of freedom, 
is significant at the .01 level. It is evident 
that extreme “over” achievers, those with a 
score of +3.00 or over, in contrast to extreme 
“under” achievers, those with scores of —3.00 
or less, tend more frequently to remain at 
least a year and to be higher producers. The 
one representative who was made an Agency
Assistant before the end of the first year was
an extreme “over” achiever and fell in the
highest production group even with only
eleven months of production represented.


Table 1

Over and Under Achievement in the Sales School versus Length of Service and
Premium Production in the First Year

Index of                        Per Cent of Agents      Per Cent of Agents     Per Cent of Agents
Achievement                     Who Terminated          Who Remained           Who Remained
(Actual School Grade            Prior to 1 Year         1 Year and             1 Year and
Minus Predicted                 or Produced Less        Produced Between       Produced
School Grade)            N      Than $2500 or Both      $2500-$4999            $5000 or Over
+3.00 and over          18            22%                    39%                    39%
+2.99 to -2.99          57            51                     37                     12
-3.00 and under         16            81                     13                      6


In addition to success in the first year as
treated in Table 1 we were also interested in
following the same line of approach with total
production during two years. Using the same
breakdown of achievement scores but with a
different division of the combined criterion of
length of service and premium production,
Table 2 was constructed.


Table 2

Over and Under Achievement in the Sales School versus Length of Service and
Total Two-Year Production

Index of                        Per Cent of Agents      Per Cent of Agents     Per Cent of Agents
Achievement                     Who Terminated          Who Remained           Who Remained 2
(Actual School Grade            Before the End of       2 Years and            Years and Produced
Minus Predicted                 2 Years or Produced     Produced               $10,000 or Over or
School Grade)            N      Less Than $5000         $5000-$9999            Became Agency Assistants*
+3.00 and over          18            22%                    33%                    45%
+2.99 to -2.99          57            55                     30                     16
-3.00 and under         16            75                     19                      6

* Persons charged with agency management responsibilities and not engaged primarily in the sale of Life
Insurance for their own accounts.


A Chi Square based on Table 2 is 14.00
which, with four degrees of freedom, is signifi-
cant at the .01 level. Table 2 indicates that
the same general results persist over a two-
year period.

In the above discussion, +3.00 and -3.00
were chosen as points where we could be rea-
sonably sure that no chance variation in the
school grading would account for the differ-
ence. However, it is of interest to note that
we obtain the same general results if the break
between “over” and “under” achievers is made
at zero although we find, as would be expected,
that the differences are not as pronounced.
These results are given in Tables 3 and 4.


Table 3

Over and Under Achievement in the Sales School versus First Year Production

                          First Year Production or Total            No. and Per Cent of
                          Production if Left Under                  Agents Who Remained
Relation of Actual        Twelve Months                             12 Months or More
School Grade              Under $2500   $2500-$4999   $5000 or Over
to Predicted       N        N     %       N     %       N     %        N     %
Over              49       16    33      [?]   [?]     [?]   [?]      [?]   [?]
Under             42      [?]   [?]      [?]   [?]     [?]   [?]      [?]   [?]


Table 4

Over and Under Achievement in the Sales School versus Total Two Years’ Production

                          Total Two-Year Production or Total        No. and Per Cent of      No. Made
                          Production if Left Prior to               Agents Contracted Who    Agency
Relation of Actual        End of Two Years                          Remained Two Years       Assistants
School Grade              0-$5000     $5000-$9999   $10,000         (Includes Agency         (as of
to Predicted                                        and Over        Assistants)              July, 1952)
                   N       N    %       N    %       N    %           N    %
Over              49      17   35%     16   33%     16   32%         33   67%                  11
Under             42      28   67      13   31       1    2          16   38                    0

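
The Chi Square reported for Table 1 of this article (index of achievement against first-year outcome) can be checked with a small sketch. The cell counts below are an assumption: they are reconstructed by applying the tabled percentages to the row totals of 18, 57 and 16 and rounding to whole agents; the function name is illustrative.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: +3.00 and over, +2.99 to -2.99, -3.00 and under.
# Columns: terminated or under $2500, $2500-$4999, $5000 or over.
table_1_counts = [
    [4, 7, 7],     # 22%, 39%, 39% of 18 (reconstructed counts)
    [29, 21, 7],   # 51%, 37%, 12% of 57 (reconstructed counts)
    [13, 2, 1],    # 81%, 13%, 6% of 16 (reconstructed counts)
]

stat, dof = chi_square(table_1_counts)
print(round(stat, 2), dof)
```

With these reconstructed counts the statistic agrees closely with the 15.44 on four degrees of freedom given in the text. Several expected frequencies are small, so the result should be read as a check on the arithmetic rather than as a refined test.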


Summary 


The results reported in this study indicate
that agents who receive a score of “A” on the
Life Insurance Scale of Strong’s Vocational
Interest Blank, a score of “A” on the Aptitude
Index, and who achieve a higher grade in the
“basic school” than would be predicted from 
their LOMA score: (1) remained with the 
company longer; (2) produced more paid 
premiums; and (3) were promoted to super- 
visory positions oftener than the agents who 
did not achieve a “basic school” grade as high 
as their LOMA test score would predict. 


Received September 11, 1952. 
Early publication. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953 


A Personality Study of Professional and Student Actors * 


Chalmers L. Stacey

Syracuse University

and

Herman D. Goldberg

Hofstra College


* The authors express their thanks for the assistance
of Mr. Robert [illegible] Theatre and Academy;
Clarence Derwent, President of Actors Equity;
Sawyer Falk, Director of Dramatic Activities at
Syracuse University; and Eugene Perlman, assistant
at Hofstra College.


For a long time it has been the contention
of the authors that a great deal of frustration
and heartbreak could be avoided if some meas-
ures of ability to attain success could be estab-
lished for young people who wish to act in the
professional field. Year after year thousands
of young hopefuls flood the offices of the
theatrical agents of Broadway and Hollywood
determined to make a name for themselves.
In many cases their decisions have been based
on the fact that someone said that they were
the possessors of pretty or handsome faces.

At the present time there is no standard for
measuring or predicting the success a young,
would-be actor hopes to attain. However, the
purpose of this experiment is not to establish
such an over-all criterion but rather to deter-
mine some descriptive elements of the person-
alities of the groups studied.

Personality was selected as the basis of
measurement for more than the reason of ex-
pediency. It was felt that personality as well
as talent was one of the basic factors for at-
taining a certain amount of success in the field
of acting. Discussion with the Broadway ac-
tors who were used as subjects in the study
seemed to verify this fact.

It is hoped that the significant knowledge
contained in the following material will be
used to understand further the bewildering
position of the young people who have chosen
for themselves the difficult art of acting.


Problem


The present experiment was conducted in
order to answer the following questions:

A. Do students in the School of Speech and
Dramatic Art, Syracuse University, who ex-
press the desire to become professional actors
and who appear in the major productions at
the Civic University Theatre, have a pattern
of personality traits similar to that of profes-
sional actors?

B. Do students in the School of Speech and 
Dramatic Art, Syracuse University, who ex- 
press the desire to become professional actors 
but for various reasons do not appear in the 
major productions at the Civic University 
Theatre have a pattern of personality traits 
similar to professional actors? 

C. Do students in the School of Speech and
Dramatic Art, Syracuse University, who ex-
press the desire to become professional actors
and who appear in the major productions at
the Civic University Theatre have a pattern
of personality traits closer to that of profes-
sional actors than do students in the School
of Speech and Dramatic Art, Syracuse Uni-
versity, who express the desire to become pro-
fessional actors but for various reasons do not
appear in the major productions at the Civic
University Theatre?


The Experimental Situation 


Subjects. The following three groups were
tested: (a) A total of 74 professional actors
with a minimum of five years professional
experience; (b) 30 students of the School of
Speech and Dramatic Art who appeared in the
University productions; and (c) 100 students
of the School of Speech and Dramatic Art who
did not appear in University productions.

The Work of the Experimenter. The ex-
perimenter presented the questionnaires to the
subjects; insured no communication between
subjects once the examination period began;
and answered only questions concerning spe-
cific items.


The Material. Two personality question-
naires were used during the course of this ex-
periment. These were J. P. Guilford’s An
Inventory of Factors STDCR which tested
the factors: (S) Social Introversion-extraver-
sion, (T) Thinking Introversion-extraversion,
(D) Depression, (C) Cycloid Disposition,
(R) Rhathymia; and The Guilford-Martin
Personnel Inventory which tested the fac-
tors: (O) Objectivity, (Ag) Agreeableness,
(Co) Cooperativeness.

The Data and Their Analyses. For pur-
poses of this study two statistical tests
were used to analyze the data: the F test to
test the differences in variances and the t test
to test the differences in means. Tables 1 and
2 present these findings.


Table 1

Showing the Variances for: (1) Professional Actors,
(2) Student Actors Who Appeared in University Pro-
ductions, and (3) Student Actors Who Did Not Appear
in University Productions on Factors S T D C R
and O Ag Co of the Inventories

         Group I    Group II    Group III
S          79.2       64.7        76.8
T          84.3      138.7       100.9
D         141.4      136.9       125.6
C         136.6      136.9       134.7
R         132.7      149.5        96.4
O         184.2      184.2       200.0
Ag        108.0       75.7       120.0
Co        190.2      226.7       345.5


Table 2

Showing “F” and “t” Values for the Differences of
Variances and Means Among the
Three Groups

         Groups I and II    Groups I and III    Groups II and III
          “F”     “t”         “F”     “t”          “F”     “t”
[cell values illegible]

* Significant at the 5 per cent level.
** Significant at the 1 per cent level.
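
The F test for differences in variances mentioned above can be illustrated from the variances in Table 1. The sketch below assumes the conventional form of the ratio (larger variance over smaller) and uses the tabled variances as read from the source; the names `f_ratio`, `PAIRS` and `f_table` are illustrative, and significance levels are not computed.

```python
# Variances for Groups I, II and III on each factor, as read from Table 1.
variances = {
    "S": (79.2, 64.7, 76.8),
    "T": (84.3, 138.7, 100.9),
    "D": (141.4, 136.9, 125.6),
    "C": (136.6, 136.9, 134.7),
    "R": (132.7, 149.5, 96.4),
    "O": (184.2, 184.2, 200.0),
    "Ag": (108.0, 75.7, 120.0),
    "Co": (190.2, 226.7, 345.5),
}

# Index positions of the two groups in each comparison.
PAIRS = {"I-II": (0, 1), "I-III": (0, 2), "II-III": (1, 2)}


def f_ratio(var_a, var_b):
    """Variance ratio with the larger variance in the numerator."""
    return max(var_a, var_b) / min(var_a, var_b)


f_table = {
    trait: {name: round(f_ratio(v[a], v[b]), 2) for name, (a, b) in PAIRS.items()}
    for trait, v in variances.items()
}

for trait, row in f_table.items():
    print(trait, row)
```

Judging a ratio as significant would further require the degrees of freedom for each pair of groups (74, 30 and 100 subjects, less one each) and an F table, which this sketch leaves out.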


Conclusions


The following conclusions were arrived at
from an analysis of the data:

In the traits of Cycloid Disposition, Objec-
tivity, Agreeableness, Cooperativeness, there
was no difference in degree of personality trait
between the professional actors and student
actors who did not appear in university pro-
ductions.

Professional actors, however, are signifi- 
cantly more shy, seclusive, and have a greater 
tendency to withdraw from social contacts 
than do student actors who do not appear in 
university productions (Factor S). 

It may also be said that student actors who 
do not appear in university productions have 
significantly more of the tendencies to seek 
social contacts and enjoy the company of 
others (Factor S). 

Professional actors are significantly more 
inclined to meditative thinking, philosophiz- 
ing, analyzing one’s self and others than are 
student actors who do not appear in university 
productions. These student actors tend sig- 
nificantly toward an extravertive orientation 
of the thinking processes (Factor T).

Professional actors show significantly more 
signs of depression than do students who do 
not appear in university productions. Fur- 
thermore, student actors who do not appear 
in university productions are significantly 
more cheerful and optimistic than professional 
actors (Factor D). 

Professional actors are significantly more 
inhibited, over-controlled, conscientious and 
serious-minded than student actors who do not 
appear in university productions (Factor R). 

Student actors who do appear in university
productions and student actors who do not
exhibit about the same degree of personality 
in all eight traits measured. 

Professional actors and student actors who 
appear in university productions exhibit about 
the same degree of personality in traits of 
Cycloid Disposition, Rhathymia, Objectivity, 
Agreeableness and Cooperativeness. 

The student actors who did appear in uni- 
versity productions were like professional ac- 
tors on six and significantly different on only 
two of the eight traits measured. However,
student actors who did not appear in univer- 
sity productions were like professional actors 
on four and significantly different on four of 
the eight traits measured. 


Received March 10, 1952. 




The Journal of Applied Psychology
Vol. 37, No. 1, 1953 


Factors Influencing Reliability and Validity of Leaderless
Group Discussion Assessment ¹


Bernard M. Bass, Stanley Klubeck and Cecil R. Wurster 


Louisiana State University 


A substantial body of evidence is available
to suggest that behavior in the initially leaderless
group discussion is indicative of leadership
potential for a fairly wide range of situations
(1, 2, 4, 5, 6, 8, 10, 11, 12, 13, 14). However,
the leaderless discussion technique has one
obvious handicap, at least. Since very few
candidates can be placed in a single discussion,
many discussions are likely to contain
unrepresentative samples of the population being
assessed. Some discussions may contain only
persons high in leadership potential; others
may contain only persons low in such potential.

Since it has been shown that a person’s
leaderless discussion rating will be lower, the
more candidates he must compete with in a
given discussion (3), it seems reasonable to
hypothesize that a candidate’s ratings will depend
on the “quality” as
those he is facing. The 
leadership potential and the effectiveness of 


Iso, the more reliabili 
technique were found 


1 This study was aided by a grant from the Louisiana
State University Council on Research.




to discussion, the more would these allowances
be necessary. The purpose of this investigation
was to determine the extent to
which the reliability and validity of the LGD 
varied from discussion to discussion and the 
extent to which these variations could be ac- 
counted for by other known LGD variables 


such as the quantity and “quality” of participants.


Method 


The investigators had available the assessment
and criterion data from LGD validation
studies by Wurster and Bass (14) based on
14 discussions among fraternity pledges; by
Doll (6) based on 20 discussions among
sorority members; and by Bass and Coates
(2) based on 21 discussions among Army
cadets and 12 discussions among Air Force
cadets.

For each of the 67 discussions the following
indices were computed: $r_{dc}$ = the validity of
LGD observers’ ratings as indicated by their
correlation with a criterion of peers’ or superiors’
appraisals of the participants’ leadership
potential;² $r_{xx'}$ = the reliability of the
two LGD observers’ ratings as estimated by
the correlation between them; $M_d$ and $SD_d$ =
the mean and standard deviation of assigned
LGD ratings; $M_c$ and $SD_c$ = the mean and
standard deviation of participants’ criterion
ratings; $K$ = the number of participants in a
designated discussion; and $E$ = the extent to
which a designated discussion attained its objectives
as rated by both observers on a
point scale. The corrected split-half reliabilities
of both observers’ ratings of this measure
of group effectiveness for the 4 studies
respectively were .74, .75, .78 and .85.
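As an illustration of how the indices defined above could be obtained for a single discussion, the following is a minimal sketch only. The ratings are hypothetical, the helper names `pearson` and `discussion_indices` are introduced here for illustration, and the use of the population standard deviation is an assumption, since the source does not state which form was used. The discussion rating d is taken as the sum of the two observers' ratings, in keeping with the relation x + x' = d given later in the article.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(u, v):
    """Pearson product-moment correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def discussion_indices(x, x_prime, criterion):
    """Indices for one discussion; d is the sum of the two observers' ratings."""
    d = [a + b for a, b in zip(x, x_prime)]
    return {
        "r_dc": pearson(d, criterion),   # validity: LGD rating vs. criterion
        "r_xx": pearson(x, x_prime),     # reliability: observer agreement
        "M_d": mean(d),
        "SD_d": pstdev(d),               # population SD (an assumption)
        "M_c": mean(criterion),
        "SD_c": pstdev(criterion),
        "K": len(d),                     # number of participants
    }


# Hypothetical ratings for one 7-person discussion (not data from the study).
x = [10, 14, 8, 20, 16, 12, 18]
x_prime = [12, 13, 9, 18, 17, 10, 19]
criterion = [30, 35, 28, 44, 40, 33, 38]
idx = discussion_indices(x, x_prime, criterion)
```

Repeating this for each of the 67 discussions would give one row of indices per discussion, which is the form of data the later correlational analyses require.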


2 For a discussion of how these ratings were made
the reader is referred to the original studies (2, 14).


Table 1

Number of Groups, Means and Standard Deviations of $r_{dc}$, $r_{xx'}$, $M_d$, $SD_d$, $M_c$, $SD_c$, $E$, and $K$

Means

Subjects No. Groups $r_{dc}$ $r_{xx'}$ $M_d$ $SD_d$ $M_c$ $SD_c$ $E$ $K$

Fraternity pledges (14) 14 36: 85 S07) 149). 407 Jaso Sip p2 

Sorority members (6) 20 a7 850 (29.08 11.08 302 «8.7 S6. 70 

Army cadets (2) 21 20 Siaa 171 D2 nse sA Ti 

Air Force cadets (2) 12 MF 88. 98h 172 393 15.7 SA 68 

Weighted Average* ar ete sis ag 

Standard Deviations 

Fraternity pledges (14) 14 .43 .09 7.0 3.8 6.9 3.0 1.6 .67
Sorority members (6) 20 .46 .10 3.4ᵃ 2.6ᵃ 4.3 2.5 1.7 .00
Army cadets (2) A Tg a gA 2.5 17+ <'6 83 
Air Force cadets (2) 12 39 S o 6.3 3.6 11.4 6.0 1.9 .83 
{ POR Ves — o 3) eae aS 


Weighted Average*


; 7 5 7 
* Only computed for variables whose scales remained the same for all four samples of subjects. 


ᵃ Used 7-item rather than 9-item rating scale.


Variations in Reliability and Validity 


Table 1 shows the means and standard deviations
of the above measures study-by-study.
Of special interest to the investigators
were the following conclusions inferred from
the results reported in Table 1.

1. The reliability of LGD observers’ ratings
appeared quite stable as judged from the
standard deviation of these reliabilities of .12.
Judging from the average mean reliability,
it appears that this measure suffered little
when obtained discussion-by-discussion in
comparison to previous studies where it was
obtained from pooled data.

2. The average validity of .27
was less than obtained for the same data when
analysis was performed in previous studies by
pooling many discussions. This suggested
that the validity of the LGD would be lowered
when use was made of any discussion score,
such as comparison rankings among
participants, that did not make allowance for
between-discussion variations in candidates.

3. The average standard deviation of the
validity coefficients of .43 indicated that there
was tremendous variability in validity
from discussion to discussion, about as much
variation as the correlation coefficient
would allow. Therefore, it was deduced
that any factors which could be found
to correlate with the validity coefficient, even
if the correlations were quite low, could account
for a wide variation in validity.


Correlational Analysis 


Pearson product-moment intercorrelations 
were computed between the various group 
measures. Each intercorrelation was com- 
puted separately for each of the 4 samples be- 
cause of the variations in rating scales used 
from study to study. The correlations were 
transformed by means of Fisher’s Z conver- 
sion and then averaged. 
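The averaging step described above can be sketched as follows. This is a minimal illustration only: the function name `average_r` and the example coefficients are hypothetical, and the source does not say whether the four samples were weighted equally or by their number of discussions, so both options are shown as assumptions.

```python
from math import atanh, tanh


def average_r(rs, weights=None):
    """Average correlations through Fisher's z: z = atanh(r), then back with tanh."""
    if weights is None:
        weights = [1.0] * len(rs)      # equal weighting (an assumption)
    z_mean = sum(w * atanh(r) for r, w in zip(rs, weights)) / sum(weights)
    return tanh(z_mean)


# Hypothetical coefficients from four samples (not values from the study).
rs = [0.30, 0.45, 0.20, 0.40]
unweighted = average_r(rs)
# Weighting by the number of discussions per sample is one plausible option.
weighted = average_r(rs, weights=[14, 20, 21, 12])
```

Averaging in the z metric rather than averaging the raw coefficients keeps large coefficients from being understated, because the sampling distribution of r is skewed when the correlation is far from zero.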

Several comments concerning this matrix 
shown in Table 2 appear pertinent: 

1. The significantly negative correlation of 
—.37 between group size and mean discussion 
ratings assigned corroborated similar results 
obtained by Bass and Norton (3), who varied 
group size systematically from 2 to 12. Thus, 
even where size differences were small and ac- 
cidental as in the present analysis, substantial 
variations in LGD leadership ratings assigned
were found associated with variations in the 
number of participants per discussion. 

2. The significant correlation of .46 between
rated discussion effectiveness and mean
LGD ratings assigned group-by-group sug- 
gested that the discussion observers had a con- 
sistent frame of reference in making these two 
ratings which transferred from one group to 




another, since, by definition, leadership ratings 
of individuals were supposed to depend on the 
degree to which they moved their group to- 
wards its goal while group effectiveness was 
defined as degree of goal attainment.

3. Possibly the most valuable finding of this 
analysis was the significant correlation of .35 
between the mean discussion rating assigned 
and the mean criterion status of discussion 
participants. This suggested strongly that 
absolute ratings of the discussion observers 
were accurately sensitive to group variations 
in outside leadership potential. It implied 
that there was a “between groups” positive
correlation as well as a “within groups” positive
correlation between LGD ratings and
outside appraisals of leadership potential. It 
was the validity due to “between groups” 
covariance which was lost when correlational 
analyses between test and criterion were run 
group-by-group rather than for an entire sam- 
ple. This probably accounted for the low 
mean validity of .27 based on group-by-group 
analyses reported in Table 1 in contrast to the 
validities of .40 and .50 reported when data
are pooled. 

4. The positive but insignificant correlation 
of .20 between the group-by-group standard 
deviations of discussion ratings and the group- 
s of criterion rat- 
rver’s ratings were 
he variation in re- 
outside leadership 
8 the participants. Once again, 


tively related (r = -.28). This suggested
that average and poor discussion participants
were handicapped most severely when in competition
with those participants of the entire
sample who earned extremely high LGD
ratings.

6. The reliability or extent of agreement between
the two discussion observers appeared
significantly related (r = .54) with the standard
deviation of the discussion ratings. However,
this correlation was an artifactual relationship,
since by a simple transposition of
the formula for the standard deviation of a
sum of correlated scores it can be shown that

$$r_{xx'} = \frac{SD_d^2 - SD_x^2 - SD_{x'}^2}{2\,SD_x\,SD_{x'}}$$

where $x$ and $x'$ refer to each of the 2 observers’ ratings and
where $x + x' = d$.

7. A significantly negative correlation of
-.32 was found between mean discussion ratings
assigned and the reliability of ratings. A
possible explanation for this correlation offered
by the observers was that they, the observers,
became more interested and absorbed
in discussions with very good participants
while remaining more detached when participants
were poorer. This same hypothesis was
used to account for the one highly significant
curvilinear relationship which was found to
exist among the variables. For each of the
four samples respectively, etas of .54, .78, [illegible]
and .68 were found between the rated effectiveness
of the group discussion and the extent
to which the observers agreed on the leadership
ratings they assigned. Agreement among
observers reached a maximum in groups of
average effectiveness or goal attainment while
reliability of discussion ratings was low in
both extremely effective and extremely ineffective
groups.

Table 2

Mean Intercorrelations Among $r_{dc}$, $r_{xx'}$, $M_d$, $SD_d$, $M_c$, $SD_c$, $E$, and $K$ (N = 67 Discussions)*

            $r_{xx'}$   $M_d$    $SD_d$   $M_c$    $SD_c$         $E$            $K$
$r_{dc}$    .07         .09      .24      .28      .29            .16            [illegible]
$r_{xx'}$               -.32     .54      -.09     .03            .01            .37
$M_d$                            -.28     .35      -.03           .46            -.37
$SD_d$                                    .06      .20            .12            [illegible]
$M_c$                                              [illegible]    [illegible]    .10
$SD_c$                                                            [illegible]    [illegible]
$E$                                                                              .01

* With 65 df, p < .05 when r = .24; p < .01 when r = .31. All
correlations significant at and below the 5 per cent level of
confidence are in boldface type.

8. The validity of the LGD was correlated
at the 5 per cent level of confidence with three
variables: the group-by-group means of criterion
ratings, as well as the group-by-group
standard deviations of discussion and criterion
ratings. Since these three variables were all
related to each other, it was difficult at this
point to determine which ones were uniquely
related with LGD validity.
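The artifact noted in point 6 rests on the identity for the variance of a sum of two correlated scores, Var(d) = Var(x) + Var(x') + 2 Cov(x, x'), which can be checked numerically. The sketch below uses hypothetical ratings and helper names introduced here for illustration; it assumes the same (population) variance definition is used for every term.

```python
from math import sqrt
from statistics import mean


def pvar(v):
    """Population variance."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)


def pearson(u, v):
    """Pearson product-moment correlation."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / sqrt(pvar(u) * pvar(v))


# Hypothetical ratings by two observers of the same 7 participants.
x = [10, 14, 8, 20, 16, 12, 18]
x_prime = [12, 13, 9, 18, 17, 10, 19]
d = [a + b for a, b in zip(x, x_prime)]   # summed discussion rating

direct = pearson(x, x_prime)
from_sds = (pvar(d) - pvar(x) - pvar(x_prime)) / (
    2 * sqrt(pvar(x)) * sqrt(pvar(x_prime))
)
```

Because the spread of the summed rating appears in the numerator, a discussion with a larger standard deviation of summed ratings will, with the observers' own spreads held roughly fixed, necessarily show higher agreement between observers, which is why the correlation of .54 cannot be read as an independent finding.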


Multiple Correlational Analyses 


It was decided to isolate the unique contribution
to the variance of the validity of each
of the other 7 variables of this investigation.
This was done by determining the multiple
correlation between the validity of the LGD
and an optimally weighted sum of scores derived
from the other 7 variables. The Doolittle
rather than the Wherry-Doolittle solution
was used to obtain the multiple R and the
beta weights since interest of the investigators
was focused on studying the effects of all the
other variables on validity rather than the effects
of the smallest number of the other variables
which would yield the highest multiple
correlation with LGD validity. The multiple
correlation obtained was .43, indicating that
approximately 19 per cent of the variation in
validity from group to group could be
accounted for by those other variables. Of
this 19 per cent, 6 per cent was accounted for
by the standard deviation of criterion ratings;
[illegible] per cent, by the standard deviation of LGD
ratings; 4 per cent, by mean criterion ratings;
3 per cent, by group size; [illegible] per cent, by
the group effectiveness. These results suggest
that:

1. The validity of a given discussion particularly
suffered when there was a restriction in range of
criterion ratings. This particular handicap
will probably always remain in any
situational test where a candidate’s
rating will to some extent depend upon the
combination of candidates with whom he
happens to be grouped for assessment.


The effect on validity of criterion variations 
from discussion to discussion also suggests 
that increased effort must be directed toward 
training the observers to develop a standard 
frame of reference which transcends any 
given group discussion. 

2. The LGD may be expected to demon- 
strate greater discriminability among those 
higher on criteria of leadership potential than 
among those lower on such external criteria. 

3. LGD validity may be raised by increas- 
ing the standard deviations of discussion rat- 
ings. Aside from further training of the raters 
and more emphasis on forcing the raters to 
make greater discriminations, a number of 
ways may be suggested to increase the validity 
of the LGD. 

a. The discussion time may be
lengthened from 30 minutes to an hour or
more with the expectation that greater strati- 
fication in status may occur—although no evi- 
dence is available to support this contention. 

b. All the candidates can be coached briefly 
on how to be successful discussion leaders. 
Klubeck and Bass (9) have shown that while 
brief coaching raises significantly the LGD 
ratings of participants who are fairly success- 
ful initially without such training, such train- 
ing does not alter the LGD ratings of those 
who have been found initially unable to 
emerge as discussion leaders. This would suggest
that briefly coaching all participants
would lead to a greater dispersion in the LGD 
behavior, although, of course, a long period of 
training might be expected to do otherwise. 

4. Size appeared negatively related to valid- 
ity. However, the relationship was too low to 
be anything but suggestive. Although groups 
varied only from 6 to 8 in size in these analy- 
ses, these variations accounted for 3 per cent 
of the variance. A study may be warranted 
of the relation between group size and validity 
similar to Bass and Norton’s (3) analyses of
the relation between size and reliability. 
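The multiple correlation analysis described in this section can be illustrated with a small sketch. Several points are assumptions rather than the authors' procedure: plain Gaussian elimination stands in for the Doolittle computing routine (both solve the same normal equations), the share of variance credited to each predictor is taken as the product of its beta weight and its validity correlation (a common convention, not confirmed by the source), and all names and numbers below are hypothetical. For reference, a multiple R of .43 squares to about .185, in line with the roughly 19 per cent reported.

```python
from math import sqrt


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_correlation(r_pred, r_crit):
    """Return (R, beta weights, per-predictor shares of R squared).

    r_pred: intercorrelation matrix of the predictors.
    r_crit: correlations of each predictor with the criterion.
    """
    betas = solve(r_pred, r_crit)
    shares = [b * r for b, r in zip(betas, r_crit)]     # beta_i * r_i
    return sqrt(sum(shares)), betas, shares


# Hypothetical three-predictor example (not values from the study).
r_pred = [[1.00, 0.20, 0.10],
          [0.20, 1.00, 0.30],
          [0.10, 0.30, 1.00]]
r_crit = [0.30, 0.25, 0.20]
R, betas, shares = multiple_correlation(r_pred, r_crit)
```

When predictors are intercorrelated, as the three validity-related variables were here, individual shares can shift considerably with small changes in the correlations, which is one reason the small percentages reported above are best treated as suggestive.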

Similar Doolittle analyses were made to see
the extent to which each of the other 7 variables
contributed to the variance of the reliability
of discussion ratings from group to
group and the extent each of the other 7 variables
contributed to the variations in efficiency
from group to group. The two obtained multiple
correlations were .59 and .56 respectively;
however, little further knowledge was
added to understanding of the relationships
among the variables than had been found by
inspection of the correlation matrix.


Summary 


In order to determine the extent to which 
it was possible to account for variations in the 
reliability and validity of the leaderless group 
discussion, the means and standard deviations 
were computed for 8 variables along which 
LGD’s vary. Also, a mean intercorrelation 
matrix was computed among these 8 variables. 

The most important findings of this and re- 
lated analyses were that: 

1. The validity of the individual LGD 
varied greatly from discussion to discussion 
while the reliability of LGD ratings appeared 
quite stable. 

2. Absolute ratings of leadership perform- 
ance by LGD observers appeared accurately 
sensitive to variations from discussion to discussion
in the outside leadership status of the
participants. 

3. Discussion observers’ ratings agreed most 
closely when discussions were average in ef- 
fectiveness rather than extremely effective or 
ineffective.

4. The validity of a given LGD was higher, 
the higher the outside leadership status of the 
participants in the discussion, the more strati- 
fied this status, and the more diverse the LGD 
ratings the observers were able to assign. 

These results suggested a number of ways
in which it might be possible to raise the 
validity of the LGD for assessing leadership 
potential. 


Received March 3, 1952. 


References

1. Arbous, A. G., and Maree, J. Contribution of two group discussion techniques to a validated test battery. Occup. Psychol., 1951, 25, 1-17.
2. Bass, B. M., and Coates, C. H. Forecasting officer potential using the leaderless group discussion. J. abn. soc. Psychol., 1952, 47, 321-325.
3. Bass, B. M., and Norton, F. T. M. Group size and leaderless discussion. J. appl. Psychol., 1951, 35, 397-400.
4. Bass, B. M., and White, O. Situational tests. II. Observers’ ratings of leaderless group discussion participants as indicators of external leadership status. Educ. Psychol. Measmt., 1951, 11, 355-361.
5. Carter, L., Haythorn, W., Meirowitz, B., and Lanzetta, J. The relation of categorization and ratings in the observation of group behavior. Hum. Relat., 1951, 4, 239-253.
6. Doll, P. A. Validity of LGD assessment of unacquainted women. Unpublished Master’s thesis, Louisiana State University, Baton Rouge, 1952.
7. Guilford, J. P. Fundamental statistics in psychology and education. New York: McGraw-Hill, 1950.
8. Harris, H. The group approach to leadership-testing. London: Routledge and Kegan Paul, 1950.
9. Klubeck, S., and Bass, B. M. Differential effects of training on persons of different leadership status. Hum. Relat. (In press).
10. Landry, H. A., Krugman, M., and Wrightstone, J. W. Validation study of the group oral interview test. New York: Board of Education, 1951.
11. Mandell, M. Validation of group oral performance test. Personnel Psychol., 1950, 3, 179-185.
12. Taft, R. Some correlates of the ability to make accurate social judgments. Ph.D. Dissertation, U. of California, Berkeley, 1950.
13. Vernon, P. E. The validation of Civil Service Selection Board procedures. Occup. Psychol., 1950, 24, 75-95.
14. Wurster, C. R., and Bass, B. M. Situational tests: IV. Validity of leaderless group discussions among strangers. Educ. Psychol. Measmt. (In press).


The Journal of Applied Psychology
Vol. 37, No. 1, 1953 


Validity of the Strong Vocational Interest Blank Nursing 
Key 


Leslie Navran 
San Francisco State College 


In a study done in 1947 (3), the Strong 
Vocational Interest Blank was administered to 
two groups of girls who were entering nursing 
training at Stanford University and San Jose 
State College. The Stanford group (N = 26) 
had a mean score of 43.8 points on the Strong 
nursing scale. The mean score of the San Jose 
State group (N = 44) was 40.1. 

In 1949, a follow-up revealed that 59 of the

9 girls had completed the first two years of 
the three-year program. The mean nursing 
scale score of this surviving group was 42.0.
According to the manual for the Strong
Blank (5), a score of 41 is the dividing line
between the B and B-plus letter grades.
Thus, the girls entering the last year of training
(who were considered by school officials
as being almost certain to graduate)¹ had an
average nursing scale score equal to only the
6th Percentile of the standardization group. 

Twenty-six of them had scores in the B, B-,
and C range. These results have important
implications with respect to the validity
of the nursing key and the use of the key
in vocational counseling.

Previous reports have indicated that in the
past the nursing key has been more useful.

In 1939, Hilgard (1) found that “those with
ratings on the Strong below [illegible]
showed little likelihood of completing the
nurses’ training course.” [illegible]

1 The writer has been informed that [illegible] of the 26
girls at Stanford completed their training successfully.

It is true, of course, that the nursing key does not
purport to measure success in getting
through school. Rather, it purports to measure
the interests of women who have continued
in nursing for a considerable period of


time after completion of training. Nevertheless,
the Strong test has been used to counsel
students prior to their entrance into training, 
and the contrast between present-day and past 
results with the nursing key indicates that 
vocational counselors should now interpret B 
and C scores on the nursing key with caution 
because the predictive value of such scores 
may have lessened materially. 

In view of the small size of the samples used 
by Navran (3), it may be rash to state flatly 
that a revision of the nursing scale is needed. 
However, the following considerations lend 
support to the possibility that such is the case: 

It has been necessary in recent years to re- 
vise some of the scales on the men’s form of 
the Strong (2, 6), and where marked changes 
in the scale have resulted, they have been at- 
tributed to developments in the occupation it- 
self which made for changes in the composi- 
tion of the people engaged in the occupation. 
There is evidence that this may also be true 
of nursing. For one thing, partly as a func- 
tion of World War II and the current Korean 
conflict, there has been a serious shortage of 
nurses. The effect of this has been to recruit 
heavily for the profession, and this may be 
bringing girls into nursing who differ from 
the standardization group, but who nonethe- 
less can and will become nurses. This is an- 
other way of saying that nursing may be draw- 
ing from a wider segment of the general popu- 
lation in terms of measured interests than was 
formerly the case. 

Related to this is the discrepancy in age 
between nursing trainees and the standardi- 
zation group. Inspection of the Strong man- 
ual (5) reveals the standardization group to 
have been 34 years old, on the average, when 
tested in 1942. This means that girls pres- 
ently graduating from high school and enter- 
ing nursing training are approximately 15 
years younger than the standardization group 
with whom they are being compared. This 




age difference may also be a factor making 
for different likes and dislikes in the present- 
day nursing trainees. 

Finally, and perhaps most importantly, 
nursing itself has become more complex and 
proliferated. There is an increasing differen- 
tiation being made between the practical nurse 
and the professional nurse. Also, specializa- 
tion in psychiatric nursing is growing more 
common, perhaps as a function of the growth 
and development of psychiatry and clinical 
psychology. It should be noted, too, that “the 
only revised scales which have differed ap- 
preciably from the old scales are the physician 
and psychologist scales.”² Since these professional
people with whom nurses are in close
association have changed so greatly, it may be 
reasonable to hypothesize that nurse trainees 
who can get along well with them may also be 
quite different from their older and successful 
fellow-nurses. This is speculation, of course, 
but in view of the results reported above, it 
makes the adducing of more data extremely
pertinent.

2 Personal communication from the consulting editors
of this journal.


Summary 


Evidence is presented which casts doubt on 
the validity of the present nursing key of the 
Strong Vocational Interest Blank. Factors 
which may account for this finding are dis- 
cussed. 


Received April 28, 1952. 


References 


1. Hilgard, Josephine R. Strong Vocational Interest scores and completion of training in a school of nursing. Psychol. Bull., 1939, 36, 646.
2. Kriedt, P. H. Vocational interests of psychologists. J. appl. Psychol., 1949, 33, 482-488.
3. Navran, L. The Super-Roper technique as a measure of interest in nursing. J. appl. Psychol., 1950, 34, 417-422.
4. Roper, Sylvia A. A test of interest in nursing. Unpublished Master’s thesis, Clark University, 1940.
5. Strong, E. K., Jr. Manual for Vocational Interest Blank for Women. Stanford, California: Stanford University Press, 1945.
6. Strong, E. K., Jr. Vocational interests of accountants. J. appl. Psychol., 1949, 33, 474-481.


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


Individual Differences in Ability to Fake Vocational Interests


Ralph Garry 
Boston University 


The purpose of this study was to investigate
individual differences in ability to fake voca- 
tional interests, to determine if reliable in- 
dividual differences in faking ability existed, 
and if such differences could be related to 
vocational selection. Three separate trials 
were made, requesting college students, who 
had previously been administered the Strong 
Vocational Interest Blank under standard 
directions, to obtain as high a score as possible 
on certain of the occupational interest scales. 
Derived faking scores for the several scales 
were correlated to determine generality of 
ability to fake, and also the reliability of such 
faking.


Background 


The present study was initially begun in
conjunction with the Medical Specialists Research
Project, a joint undertaking of Stanford
University and the Surgeon General of
the U. S. Army, having for its purpose the
development of test instruments designed to
facilitate assignment and classification of doctors
into residency training programs.¹ It
was hoped that if reliable individual differences
existed, they could be used in distinguishing
candidates specializing in psychiatry
from those in surgery, the assumption being
that psychiatrists would show greater insight
into attitudes and interests.

Although several attempts have been made
to devise measures of “social intelligence,”
the so-called tests of “social intelligence”
and “social judgment” have been little better
than crude measures of intelligence. On
the other hand, the results of several
studies have suggested that the ability to obtain
scores in predetermined ways on [illegible]
tests showed promise as a possible [illegible]
in the area of social intelligence or psychological
insight (1, 4).

1 The author is indebted to Lloyd G. Humphreys
under whose supervision the present study was
executed.

Strong (7, p. 685) reports that “testees can 
deliberately obtain high occupational interest 
scores when they try.” Benton and Korn- 
hauser (1), interested in the use of the Strong 
Interest Blank in selection of medical school 
students, asked a group of 34 undergraduate 
college students (mainly social science majors) 
to fake as high a score on the physician scale 
as possible. The results corroborate Strong’s 
finding that faking is possible. Of greater in- 
terest was an indication that all of the group 
did not gain, giving support to the premise 
that the ability to fake occurs in differing 
degrees. 

Of the few earlier studies, one of the most 
relevant to the present was Steinmetz’s (5). 
A total of 46 junior college students, directed 
to fake high scores on teacher-administrator 
scales on the Strong Blank, made significant 
gains over original scores. Intercorrelation 
between original score, faked score, intelli- 
gence and gains made in faking showed that 
both original and faked scores correlated with 
intelligence significantly greater than .00, but 
the difference between them was not statis- 
tically significant. The correlation between 
gain made and intelligence was significantly 
negative. Steinmetz infers from this negative 
correlation that intelligence makes little con- 
tribution to the obtained faking, apparently 
overlooking the extent to which the negative r 
is an artifact of method of determining gain; 
individuals with low initial scores have much 
greater possibilities for gains. 

The extent of the relationship between in- 
telligence and the ability to fake scores is 
critically important to a conclusion that fak- 
ing ability represents “social judgment.” An 
attempt to obtain a partial correlation coeffi- 
cient between these two variables holding the 
relationship of each with initial score constant 
produced coefficients in excess of 1.00, suggest- 
ing inaccuracies in the data as presented. 
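The first-order partial correlation referred to here has a simple closed form, and a short sketch shows why a value above 1.00 signals inconsistent input coefficients. The function name and the numerical values are hypothetical; in the notation below, variable 3 is the one held constant (initial score, in the case discussed above).

```python
from math import sqrt


def partial_r(r12, r13, r23):
    """Correlation of variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# A mutually consistent set of coefficients gives a value within -1 to +1.
consistent = partial_r(0.50, 0.40, 0.30)

# Coefficients that could not all arise from one set of scores
# (hypothetical values) push the result past 1.00.
impossible = partial_r(0.90, 0.90, 0.00)
```

Any three correlations computed from the same individuals must form a consistent set, so a partial coefficient outside the permissible range points to an error of reporting or computation somewhere in the published figures rather than to a property of the subjects.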

In a recent study Jessen (3) found that 
parents’ responses on answering the Kuder 




Preference Record and Bell Adjustment In- 
ventory as they thought their children would 
respond correlated .75 with child responses. 
These findings give rise to the question of how 
far such faking ability extends; that is, will a 
more general population show as high a de- 
gree of faking or is such faking ability limited 
to particular situations? 

The purpose of this study, therefore, is to 
determine the degree, extent and reliability of 
the ability to fake scores on the Strong Voca- 
tional Interest Blank for Men. 


Population 


A separate group was used on each of three 
trials. Group 1 consisted of 178 male college
undergraduates enrolled in a general psychology
course. Groups 2 and 3 included 75 and
91 students of both sexes enrolled in educational
psychology courses. The latter two
groups were more heterogeneous with respect 
to age and vocational experience. For pur- 
poses of the tables in this report only the data 
for Groups 2 and 3 are presented. 


Procedure 


1. The Strong Vocational Interest Blank for 
Men was administered using standard pro- 
cedure to a group of sufficient size to provide 
sub-groups (successful and unsuccessful at 
faking) of fair size. 

2. Biographical data were obtained for 
each subject usin 
tionnaire, 
asked for 


omy for us 
cient. 

3. A measure 
tained. The be: 
academic grade-poi 


carpenter would; as a physician
would. Although the results generally coincided
with results of the second and third
trials, it was evident that the carpenter scale
had been a poor choice, apparently being too
easily faked. The original mean score was
-45, mean faked score was 115 with insufficient
spread of scores to permit a test of differential
faking ability. The low reliability of
.39 (first versus last half corrected by Spearman-Brown
formula) confirmed the doubts regarding
the carpenter scale.

5. In the repetitions of the experiment, four
scales were used instead of two, obtaining faking
on one-half of each of the four scales in
order to remain within reasonable time limits
for testing. There are a sufficient number of
weighted items on each half of the Interest
Blank to provide an adequate measure of faking.

The four scales chosen for the second administration
were physician, minister, lawyer
and president of manufacturing concern.
They were chosen because they had the lowest
intercorrelations with physician scale, adequate
reliability and, more important, because
they represented the interest factors shown to
be present in the Strong Vocational Interest
Blank for Men in several factor analysis
studies (7, p. 314 f).

6. The reliability of the faking on the second
and third trials was determined by preparing
scoring keys for use with the IBM
test scoring machine which provided a reliability
coefficient based on odd versus even
responses. All items with plus weights were
given plus one weights, and all with minus
weights were given minus one weights.

7. In establishing a score for faking ability, it was apparent that a
high score on the faked tests did not certify to a high degree of
faking ability; rather, the ability to increase the original score was
the measure of faking ability. The problem was to obtain a relatively
uncontaminated measure of gain. The high negative correlations obtained
by Steinmetz (5) when gain made was compared with original score
nullify gain made as a measure of faking ability, for its magnitude is
a function of initial standing. Two methods were tried with
approximately equal results. The first, a ratio of gain made to gain
possible, was rejected because of the tendency of ratio scores to
produce spurious correlations under


Individual Differences in Ability to Fake Vocational Interests 


certain conditions. The faking score adopted was the difference between
the score obtained under faking directions and the score predicted on
the basis of the correlation between original and faked scores. This
difference represents a measure of faking ability, independent of
original score, which may be correlated with similar differences
obtained on the other scales.
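
The regressed faking score described in step 7 is, in effect, a residual from the linear regression of faked score on original score. The sketch below assumes an ordinary least-squares line; the scores are invented and the function name is hypothetical.

```python
from statistics import mean


def regressed_faking_scores(original, faked):
    """Residuals of faked scores about the least-squares line on original scores."""
    mx, my = mean(original), mean(faked)
    sxx = sum((x - mx) ** 2 for x in original)
    sxy = sum((x - mx) * (y - my) for x, y in zip(original, faked))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(original, faked)]


# Invented raw scores for five subjects (not data from the study).
original = [-45, -10, 5, 30, 60]
faked = [100, 120, 90, 130, 125]
residuals = regressed_faking_scores(original, faked)

# By construction the residuals average zero and are uncorrelated with
# the original scores, which is the independence the text relies on.
mx = mean(original)
cross_product = sum(r * (x - mx) for r, x in zip(residuals, original))
print([round(r, 1) for r in residuals])
```

A positive residual marks a subject who faked higher than would be predicted from his original standing, a negative residual one who faked lower.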

Scatter diagrams were prepared to check the distribution of regressed
faking scores ² about the regression line for predicted scores for each
of the four occupations. This was done as an empirical check of the
assumption that the difference scores used in the preceding
intercorrelations were randomly distributed about the regression line,
and independent of the size of initial score. The distributions
observed supported such an assumption.

8. In order to establish any generality of faking ability, it was
necessary to account for any variance in the correlation between faking
scores that is associated with intelligence, education, or experience
of a vocational or avocational nature. Faking ability, if it exists as
a psychological characteristic of any generality, should have some
independence from the aforementioned factors. Biserial correlation
coefficients were computed using the regressed scores on the
physician-faking and president-faking because of their higher
reliability and the ease with which the groups could be dichotomized as
non-informed or informed about the occupation, judging from such data
obtained on the biographical information blank.
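
A biserial coefficient of the kind used in step 8 can be sketched with the classical textbook formula, r_bis = ((M1 - M0) / s) (pq / u), where u is the normal ordinate at the point of dichotomy. The article does not state which computing form was used, so the formula choice, the function name and the data below are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, informed):
    """Classical biserial r between a continuous score and a dichotomy.

    `informed` holds 1 for the informed group and 0 for the non-informed group.
    """
    ones = [s for s, g in zip(scores, informed) if g]
    zeros = [s for s, g in zip(scores, informed) if not g]
    p = len(ones) / len(scores)
    q = 1.0 - p
    z = NormalDist().inv_cdf(q)   # point of dichotomy on the normal curve
    u = NormalDist().pdf(z)       # ordinate at that point
    return (mean(ones) - mean(zeros)) / pstdev(scores) * (p * q / u)


# Invented faking scores: both groups have the same mean, so r_bis is zero.
same = biserial_r([1, 2, 3, 1, 2, 3], [1, 1, 1, 0, 0, 0])

# Invented faking scores in which the informed group fakes higher.
higher = biserial_r([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
lower = biserial_r([1, 2, 3, 4, 5, 6], [1, 1, 1, 0, 0, 0])
print(round(same, 2), higher > 0, lower < 0)
```

With markedly non-normal scores this coefficient can exceed 1 in absolute value, which is one reason it is read mainly for its sign and rough size.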


² The term “faking score” is used to designate the difference between
the obtained faking score and the predicted (regressed) score.


Given such independence, the test for the 
presence of faking ability depended on low 
intercorrelations between initial scores, be- 
tween initial and faked scores, but high cor- 
relations between faked scores. This would 
show that the rank order obtained on faking 
differed from that on initial scales, either be- 
tween scales or within scales, thus indicating 
generality of faking ability. 

9. A final step in the treatment of the data 
was an item analysis of responses made on the 
faking of the physician scale, using upper and 
lower 27% of Group 1 and Groups 2 and 3 
combined. 
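
The upper and lower 27% item analysis of step 9 might be sketched as follows. The selection rule (rank on faking score, take the top and bottom 27%) follows the text; the discrimination index used here, a simple difference in proportions, is an assumption, as are the function names and the data.

```python
def upper_lower_groups(faking_scores, fraction=0.27):
    """Return index lists for the upper and lower `fraction` of subjects."""
    n = len(faking_scores)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: faking_scores[i])
    return order[-k:], order[:k]


def item_difference(responses, upper, lower):
    """Proportion marking the keyed response: upper group minus lower group."""
    p_upper = sum(responses[i] for i in upper) / len(upper)
    p_lower = sum(responses[i] for i in lower) / len(lower)
    return p_upper - p_lower


# Invented data: 100 subjects with faking scores 0..99, and one item that
# only the better fakers answer in the keyed direction.
scores = list(range(100))
responses = [1 if s >= 50 else 0 for s in scores]
upper, lower = upper_lower_groups(scores)
diff = item_difference(responses, upper, lower)
print(len(upper), len(lower), diff)
```

An item on which both groups give the keyed response would show a difference near zero, which corresponds to the "obvious" items discussed in the results.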


Results 


The data obtained in this study confirm the 
reports of previous investigators regarding the 
extent of faking that is possible on pencil- 
and-paper tests of personality and interest. 
Groups of individuals, given instructions to 
fake high scores on the Strong Vocational In- 
terest Blank for Men, are able to obtain sig- 
nificant increases in the group mean, although 
there are some individuals at all score levels
who do not gain. The faking apparently is 
not correlated with intelligence, sex, or in- 
formation about an occupation. The biserial 
correlation coefficients between faking score 
on physician scale and grade point average 
were —.18 and .22 (Groups 1 and 2); between 
faking score and sex were —.06 and —.02, and 
for faking score and information were .02 and 
.00 (for president, manufacturing, and physician scales with data for
Groups 2 and 3 combined).

Table 1 shows that consistent gains were made in the means on all
scales, with the standard deviations remaining fairly constant, except
for the lawyer scale. The smallest gain in the means, that made on the
president scale for Group 1, is significant at greater than the .001
level of confidence. On the whole, the similarity of the means and
standard deviations indicates the comparability of the groups. This
does not hold for the lawyer scale. The decrease of 10 raw score points
on the standard deviation is significant above the .001 level of
confidence. The most reasonable explanation for the decrease, and
similarly that found with the carpenter scale on the first trial, is
the low reliability of the faking scores.


Table 1

Means and Standard Deviations of Original and Faked Raw Scores:
Group 2 (N = 75) and Group 3 (N = 91)

[Table body illegible in source. Rows: President, Lawyer, Physician,
Minister. Columns: Mean and SD, each Original and Faked, for Group 2
and Group 3.]

It is possible that some scales are more easily faked by all members of
a group, resulting in decreased variability under faking conditions.
However, it should be noted that the observed decrease in variability
is not associated with the scale's having a higher proportion of easily
faked items, assuming the number of such items to be proportional to
the number of large scoring weights. (It was observed in item analysis
of faking that all members of the group choose the correct response for
interests that are obviously related to a given occupation.) Under such
circumstances the number of items upon which faking differences could
obtain would be proportionately smaller, resulting in decreases in
standard deviation. Both minister and physician scales have a greater
proportion of large scoring weights than the lawyer scale.

Reliability coefficients for each set of raw faking scores are
presented in Table 2, along with estimates of the reliability of the
regressed scores. With the exception of the lawyer scale, the
reliability of the faking scores is comparable to that reported by
Strong (7) for scales administered under standard conditions.


Table 2

Scale          Raw Faking Score    Regressed Score (estimated)
President            .87                 [illegible]
Lawyer               .55                 [illegible]
Physician            .89                 [illegible]
Minister             .78                 [illegible]


Table 3

Intercorrelation of Faking Scales for Groups 2 and 3

Scales Correlated                         Group 2   Group 3   Mean r
President, mfg. concern, and lawyer         .28       .2?       .28
President, mfg. concern, and physician      .34       .35       .34
President, mfg. concern, and minister       .22      -.11       .05
Lawyer and physician                       -.05       .16       .06
Lawyer and minister                         .10       .16       .13
Physician and minister                      .26      -.02       .12

Table 3 presents the correlations between the faking scores on the
various scales. These indicate that there is no marked general faking
ability; as evidence, all r's are under .36. The fact that nearly all
r's are positive, however, indicates the possibility of a weak general
faking ability, which could be proved or disproved in a subsequent
trial by using scoring keys based on item analysis of responses of
upper and lower faking groups. If true, one would expect increased
correlations using such keys.

The correlation between two faking scores is independent of the initial
correlation between scales (as reported by Strong), judging from second
order partial correlation coefficients computed between minister and
president scales, which changed negligibly from the given .00 r between
original and faking scores.
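
A second order partial correlation of the kind referred to above can be built up recursively from first order partials. The sketch below uses the standard recursion formula; which two variables were actually held constant is not stated in the text, so the matrices shown are invented and the function name is hypothetical.

```python
from math import sqrt


def partial_r(r, i, j, controls):
    """Partial correlation of variables i and j, holding `controls` constant.

    `r` is a symmetric correlation matrix given as a list of lists.
    """
    if not controls:
        return r[i][j]
    k, rest = controls[0], controls[1:]
    r_ij = partial_r(r, i, j, rest)
    r_ik = partial_r(r, i, k, rest)
    r_jk = partial_r(r, j, k, rest)
    return (r_ij - r_ik * r_jk) / sqrt((1.0 - r_ik ** 2) * (1.0 - r_jk ** 2))


# Invented first-order case: all three correlations equal to .50.
r3 = [[1.0, 0.5, 0.5],
      [0.5, 1.0, 0.5],
      [0.5, 0.5, 1.0]]
first_order = partial_r(r3, 0, 1, [2])

# Invented second-order case: variables 2 and 3 are unrelated to 0 and 1,
# so partialling them out leaves the original correlation unchanged.
r4 = [[1.0, 0.3, 0.0, 0.0],
      [0.3, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]]
second_order = partial_r(r4, 0, 1, [2, 3])
print(round(first_order, 3), second_order)
```

The second example mirrors the situation described in the text: when the controlled variables are essentially uncorrelated with the two faking scores, the partial coefficient differs negligibly from the zero-order one.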

Item analysis of faking responses to the physician scale, using the top
and bottom 27% of groups, indicates that the differences obtained
result from less than half of the weighted items. Both groups choose
the obvious responses of physicians. Successful faking is dependent on
predicting the more subtle differences in interests. Significant
differences are obtained on as many unweighted as weighted items,
suggesting considerable inaccuracy in faking. However, the differences
obtained do not result from a willingness of the successful faking
group to commit themselves to a like or dislike response while the
non-fakers remained neutral.


Summary 


Two groups of 75 and 91 college students, 
instructed to fake high scores on four scales 
of the Strong Interest Blank after taking it 
under standard directions, demonstrated: 

1. Significant increases in mean scores on 
all scales. 

2. Split-half reliability of faking ranging 
from .56 to .89, with three scales exceeding 
.75, indicating a high degree of consistency. 

3. Intercorrelations between faking scores (the difference between
obtained and regressed fake score) ranging from -.05 to .35, suggesting
a low degree of general faking ability involved, with most faking being
specific to the given scale.

4. Faking ability was not correlated (biserial r) with intelligence,
sex, or information regarding the occupation.

5. The more successful in faking (in an item analysis) predict
substantially more of the subtle occupational interests, whereas all
predict the obvious.


Received April 21, 1952. 


References 


1. Benton, A. L., and Kornhauser, S. I. A study of
“score-faking” on a medical interest test. J.
Ass. Amer. Med. Coll., 1948, 23, 57-60.

2. Broom, M. E. A further study of the validity of
a test of social intelligence. J. educ. Res.,
1930, 33, 403-405.

3. Jessen, Margaret. Parent-child cooperation in the
counseling process. Unpublished Ph.D. dissertation,
School of Education, Stanford University, 1950.

4. Kelly, E. L., Miles, Catherine, and Terman, L. M.
Ability to influence one’s scores on a pencil-and-paper
test of personality. Char. and Personality, 1936, 4, 206-215.

5. Steinmetz, J. C. Measuring ability to fake occupational
interest. J. appl. Psychol., 1932, 16, 123-130.

6. Strang, Ruth. Relation of social intelligence to
certain other factors. Sch. Soc., 1930, 32, 268-272.

7. Strong, E. K. Vocational interests of men and
women. Stanford, Calif.: Stanford Univ.
Press, 1943, xxix, pp. 746.

8. Thorndike, R. L. Factor analysis of social and
abstract intelligence. J. educ. Psychol., 1936,
27, 231-233.

THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953


The Reliability of Self-Ratings as a Function of the Amount 
of Verbal Anchoring and of the Number of Categories 
on the Scale 


A. W. Bendig 
University of Pittsburgh 


One of the first problems faced by the con- 
structor of a rating scale is the paucity of ex- 
perimental literature on the optimal character- 
istics of such scales. Information is needed, 
for example, as to the effect of variations in 
the number of scale categories and in the 
amount of verbal definition or anchoring of 
the scale categories upon both the reliability 
and validity of the scales. The scale should 
not be so coarse as to lose some of the dis- 
criminative ability of the rater, nor so fine that 
error variance is added to the ratings because 
the scale categories call for finer judgments 
than the rater is capable of making. As to 
anchoring, presumably the more defining of 
scale categories and the more objective are 
such definitions the greater will be inter-rater 
measures of reliability. However, in self-ratings, such as are common
[...] studies (3), [...] definition of s[...] undesirable [...]
measured vari[...]

Champney and Marshall [...] Symonds' analysis, [...] graphic ratings
using a millimeter scale and also using a coarser centimeter scale. The
correlation between two forms of the scale (8 families rated twice) was
significantly higher for the millimeter scale when compared with the
centimeter scale (0.77 compared with 0.67). Such a magnitude of
increase is much greater than could be predicted from Symonds'
analysis. Bendig and Hughes (2) found that an information analysis of
rating scales varying in number of categories indicates that the
absolute amount of information transmitted by the scale increased with
increasing numbers of categories, but that the increments became
smaller with longer scales.
The purpose of the study reported below was to investigate the effect
of variations in the number of scale categories and amount of verbal
anchoring upon inter-judge (1) reliability estimates of self-ratings of
individuals and of groups.


Procedure 


Scales. Fifteen different forms of a numerical rating scale were
constructed from the combinations of five different numbers of scale
categories (3, 5, 7, 9, or 11) and three conditions of verbal anchoring
of the categories (center category defined, both end categories
defined, or center and end categories defined). The lowest category on
each scale was given a numerical value of 1, the highest category was
rated as 3, 5, 7, 9, or 11, with intermediate scale categories numbered
accordingly.

Subjects. The Ss were 225 undergraduate students in introductory and
social psychology classes. The fifteen scales were randomly distributed
among the subjects, with 15 raters using each of the scales.

Instructions. Each scale was mimeographed on a single page containing
the stimuli to be rated and instructions to the rater. The stimuli were
the names of twelve foreign nations, ranging from well-known countries
such as France and Canada to lesser known nations like Sweden and
Egypt. The Ss were asked to rate themselves on how much they knew about
the political, economic, geographic, and sociological characteristics
of each country. Emphasis in the instructions was placed upon the Ss
rating their own information about each country, and the three verbal
statements used to anchor scale categories were:


I know a great deal about this country.
I know something about this country.
I know very little about this country.


Table 1

Analysis of Variance of Group and Individual Reliability Coefficients
of Rating Scales Differing in the Number of Scale Categories and Amount
of Verbal Anchoring of the Scale

                               Group Reliability              Individual Reliability
                            Sum of     Mean                 Sum of     Mean
Source of Variation    df   Squares    Square   F           Squares    Square   F
Total                  44   4049.24                         5105.64
Number of categories    4     91.91     22.98   [illegible]  150.53     37.63   [illegible]
Amount of anchoring     2    274.17    137.08   1.42         455.24    227.62   2.27
Interaction             8    793.91     99.24   1.03        1496.97    187.12   1.87
Within groups          30   2889.25     96.31               3002.90    100.10


Results 


Each group of 15 raters using one of the fifteen different scales was
randomly subdivided into three groups containing 5 raters each. An
estimate of the reliability of the group ratings of each subgroup was
computed using a technique developed by Hoyt (6), and an estimate of
the reliability of individual ratings was found using the intraclass r
described by Snedecor (8, pp. [...]246) and elaborated upon by Ebel
(5). The Hoyt procedure is designed to answer the question of the
reliability of the average of five raters on the above described [...]
task, while the Snedecor method gives the reliability of a single rater
on the same task. The resulting 45 group reliability coefficients were
analyzed within a factorial design for the effect of number of scale
categories, amount of anchoring, and the interaction of these two
variables. A similar analysis of variance was computed on the 45
individual reliability coefficients. The results of these two analyses
can be found in Table 1. It can be seen that neither of the two main
variables contributed significantly to the total variability of either
the group or the individual reliability coefficients. Also, in neither
case was the interaction term significant when tested against the
within-groups (error) mean square.
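
The two reliability estimates named above can both be obtained from one analysis of variance of a stimulus-by-rater table. The sketch below is an illustration under stated assumptions rather than the author's worksheet: it removes between-rater variance from the error term (the form discussed by Ebel), the ratings are invented, and the function name is hypothetical.

```python
def rating_reliabilities(ratings):
    """Return (group, individual) reliability for ratings[stimulus][rater]."""
    n = len(ratings)          # stimuli (here, countries)
    k = len(ratings[0])       # raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    group = (ms_rows - ms_error) / ms_rows                       # Hoyt form
    individual = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return group, individual


# Invented ratings: raters differ only by a constant, so agreement is perfect.
perfect = rating_reliabilities([[1, 2], [2, 3], [3, 4]])

# Invented ratings of four stimuli by three raters with some disagreement.
g, r1 = rating_reliabilities([[1, 3, 2], [4, 4, 5], [2, 1, 3], [5, 4, 4]])
k = 3
stepped_up = k * r1 / (1 + (k - 1) * r1)
print(round(g, 2), round(r1, 2))
```

With this error term the group coefficient is exactly the Spearman-Brown step-up of the single-rater coefficient, which is why the two estimates in the article move together.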

The three subgroups using each of the fifteen scales were pooled and
new Hoyt and Snedecor reliability estimates computed. In this analysis
each of the estimates is based upon the ratings of 15 subjects. Since
the interaction of the main variables was insignificant, the three
anchor groups were further pooled and group and individual reliability
coefficients computed for each of the number-of-scale-categories
groups. These estimates are based upon 45 raters in each group. The
average group and individual reliabilities for each of the category
groups can be found in Table 2. In general, both the group and
individual reliabilities were constant when 3, 5, 7, or 9 scale
categories were used. However, in all instances the reliability
declined somewhat when 11 scale categories were used and this decrease
in reliability becomes more evident as the number of raters increases.

Similar average group and individual reliabilities for the three anchor
groups are given in Table 3. Increased reliability can be noted with
increased amounts of anchoring, with the greatest increase occurring
between the group with both ends anchored and the group with center and
end anchoring.


Table 2

Average Group and Individual Reliability Coefficients (Decimal Points
Omitted) for Each Number of Categories on Rating Scales

               Number          Number of Scale Categories
Type of        of
Reliability    Raters      3      5      7      9      11
Group             5       68     68     67     69      65
                 15       89     88     87     89      84
                 45       96     95     96     96   [illegible]
Individual        5       28     31     33     33      29
                 15       34     34     32     35   [illegible]
                 45       33     32     33     35   [illegible]


Table 3

Average Group and Individual Reliability Coefficients (Decimal Points
Omitted) for Each Amount of Verbal Anchoring of Rating Scales

               Number              Amount of Anchoring
Type of        of                                    Center
Reliability    Raters      Center        End         and End
Group             5      [illegible]  [illegible]  [illegible]
                 15          86           87           89
Individual        5          29           28           35
                 15          29           31           36
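
The relation between the individual and group coefficients in Table 2 can be checked with the general Spearman-Brown formula for k raters. This is offered as a plausibility check under the assumption that the group coefficient is the step-up of the single-rater coefficient; the function name is hypothetical, and only legible Table 2 entries are used.

```python
def group_reliability(individual_r, n_raters):
    """Reliability of the pooled ratings of n_raters, given single-rater reliability."""
    return n_raters * individual_r / (1.0 + (n_raters - 1) * individual_r)


# Three-category scale, 45 raters: individual .33, tabled group value .96.
check_45 = group_reliability(0.33, 45)

# Three-category scale, 15 raters: individual .34, tabled group value .89.
check_15 = group_reliability(0.34, 15)
print(round(check_45, 2), round(check_15, 2))
```

Both stepped-up values agree with the tabled group coefficients to two decimals. Entries that are averages over several subgroups need not reproduce this exactly.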


Discussion 


[...]al self-ratings [...] the number [...] limits of from [...]
Champney and Marshall (4) [...] beyond the discriminative ability [...]
adds error variance to th[...] in this instance, [...] problem that is
[...] difficult and the reliability of his [...] begins to decrease.
While Champney and Marshall found increased reliability with increased
refinement of the scale, we have opposite results. An explanation of
this difference probably lies in the type of rating task presented to
the subject. Champney and Marshall had their subjects rate the observed
behavior of others; we had our subjects rate their own introspections.
Obviously we cannot generalize our results to ratings of objective
behavior, but must limit ourselves to the behavior herein investigated.
As to anchoring of the scale, increased verbal definition of the
categories resulted in slightly increased reliability. The important
anchor seemed to be that defining the center category. There was only a
slight difference between the groups that had only the center category
defined when compared with the group having only the two end categories
anchored, but the addition of a center anchor to the latter scale
appreciably raised its reliability. The lack of interaction between
number of categories and amount of anchoring may be attributable to the
fact that the categories added to the three-category scale were
unanchored categories inserted between the center and end categories.
Possibly scales each of whose categories was verbally anchored might
not have shown a drop in reliability between 9 and 11 scale points.
Synthesizing these results with those previously reported (2), we
recommend that when constructing self-rating scales 9 categories should
be used, since: (a) they are as reliable as shorter scales; and (b)
they provide more information. However, adding additional categories
provides some increase in information at the sacrifice of scale
reliability. It is further concluded that more verbal anchoring of the
scale will increase both the reliability and the information
transmitted by the scale.


Summary 


A total of 225 college students rated themselves as to how much they
knew about twelve foreign countries. The rating scales differed in
number of scale categories (3, 5, 7, 9, 11) and in amount of verbal
anchoring of the scale points (center category defined, end categories
defined, or both center and end defined). The reliabilities of
individual and of group ratings for each scale were computed by
intraclass methods. Results indicated equal reliability for scales
having 3, 5, 7, or 9 categories, but a decrease in reliability for 11
categories. The reliability of the scales increased with added scale
anchoring. The discussion emphasizes that the results can be
generalized only to self-ratings and not to ratings of observed
behavior.


Received May 9, 1952. 


References 


1. Bendig, A. W. Inter-judge vs. intra-judge reliability
in the order-of-merit method. Amer. J. Psychol., 1952, 65, 84-88.

2. Bendig, A. W., and Hughes, J. B., II. The effect
of the amount of verbal anchoring and number
of rating scale categories upon transmitted
information. (In preparation).

3. Cattell, R. B. The description of personality:
principles and findings in a factor analysis.
Amer. J. Psychol., 1945, 58, 69-90.

4. Champney, H., and Marshall, Helen. Rater’s minimal
discrimination as a criterion for determining
the optimal refinement of a rating scale.
J. appl. Psychol., 1939, 23, 323-331.

5. Ebel, R. L. Estimation of the reliability of ratings.
Psychometrika, 1951, 16, 407-424.

6. Hoyt, C. Test reliability obtained by analysis of
variance. Psychometrika, 1941, 6, 153-160.

7. Kelley, T. L. Statistical method. New York:
Macmillan, 1923.

8. Snedecor, G. W. Statistical methods. (4th ed.)
Ames, Iowa: Iowa State College Press, 1946.

9. Symonds, P. M. On the loss of reliability in
ratings due to coarseness of the scale. J. exp.
Psychol., 1924, 7, 456-461.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 1, 1953


An Analysis of Engineering Entrance Examinations 


Harry W. Case 


Department of Engineering, University of California, Los Angeles 


The problem of measuring engineering 
achievement and aptitude is certainly not new. 
Indeed, this is an area of investigation that 
has been under study for the last twenty years. 
Today if the studies related to this field are 
compiled, they add up to an impressive num- 
ber. In appraising engineering aptitude the 
exploratory investigations have ranged from 
determining the interrelationship existing be- 
tween the scores of general capacity tests and 
measures of success in an engineering curricu- 
lum to attempts to tease out the specific fac- 
tors making for success in engineering. For 
example, interrelationships as high as .62 (5) 
have been obtained between success in the first 
and second semesters and the American Coun- 
cil Psychological Examination (a general ca- 
pacity measure). 


[...] University [...] a study of [...] student suc[...] since 1945,

Subjects 


Although the investigation of the factors that make for success in this
engineering college has been underway for a number of years, many of
the students who have entered and proceeded either to graduation or
withdrawal could not be used as subjects in this study. The elimination
of numerous cases (such as those in which courses were repeated to
raise a grade, in which previous specialized and related military
training existed, or in which pre-entrance engineering extension
division study had been taken, along with other related and influencing
extraneous variables) greatly reduced the total number of cases
available for study. From a total of well over a thousand potential
subjects the actual correlations were obtained for N's which ranged
from 144 to 444. Even though the reduction in the total number of
subjects available for the measurement of the various
interrelationships is regrettable, it is believed that matching the
subjects in terms of previous training more than offsets the loss of
mass data.

It is probably somewhat unfortunate that in the majority of studies
published in the last ten years no mention has been made as to whether
the subjects have been matched in terms of previous training and
preparation, although it is recognized that many students who enter as
freshmen have had prior college or military training which may
influence their success in the first two academic years.


Procedure 


All incoming freshmen were given the complete Pre-Engineering Inventory
prior to entrance. This is a special abilities test battery using the
“task-simulation” technique and was developed as a joint project of the
Engineers' Council for Professional Development, the American Society
of Engineering Education, and the Carnegie Foundation for the
Advancement of Teaching. It consists of the seven following named
tests: General Verbal Ability, Technical Verbal Ability, Comprehension
of Scientific Materials, General Mathematical Ability, Comprehension of
Mechanical Principles, Spatial Visualizing Ability, and Understanding
of Modern Society. Examination of the material of each section of the
test indicates that its face validity closely approximates the name.

If a student was moving to junior status, either within the college or
by means of a transfer from another college, he was required to take
the Junior Status Examination. This is an achievement examination
battery of the multiple choice type consisting of five separate tests,
each covering a specific field: Chemistry, Physics, Mathematics,
English, and Drawing. The mathematics and drawing tests are University
of California, College of Engineering Examinations, while the other
three are from the Cooperative Test Series, Higher and College Level.


Table 1

Tetrachoric Intercorrelations* of P.E.I., Jr. Status Examinations,
Certain Subject Areas, and Semester Grades

[Column headings are illegible in the source, so variables are shown by
number only. Decimal points omitted; [?] marks an illegible entry.]

     2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25
 1 [?]  67  30  28  17  64  64  72  75  06  26  23  34  43  36  19  30  23  02  03  01  23  31  42
 2      67  60  45  25  50  86  80  35  30  48  22  42  56  41  26  31  02  02  08  24  44  39  56
 3          65  65  55  58  91  88  54  25  45  31  53  72  38  39  46  22  03  11  23  39  43  49
 4              62  41  40  85  71  35  19  46  47  44  48  40  29  26  16  44  31  20  41  46  46
 5                  51  41  66  73  32  17  39  21  53  49  32  24  40  11  05  02  13  28  25  31
 6                      21  47  55  30  46  37  25  25  50  26  09  07  28  20  39  18  20  23  23
 7                          55  70  45  31  34  06  26  44  39  13  38 -10 -02  12  17  33  31  39
 8                              92  49  30  57  21  60  69  50  40  44  24  12  04  32  54  54  54
 9                                  67  42  64  23  61  77  50  38  44  23  09  11  46  51  48  50
10                                      29  37  33  35  61  46  41  28  23  27  20  04  28  30  33
11                                          38  11  34  41  29  22  20  41  28  33  28  31  32  35
12                                              32  49  77  65  40  49  15  27  31  39  49  53  48
13                                                  51  61  38  47  34  13  37  35  14  29  37  37
14                                                      73  52  30  44  09  39  32  32  49  51  54
15                                                          56  46  52  24  24  23  31  55  74  65
16                                                              71  80  43  47  38  15  63  70  66
17                                                                  86  49  26  38  54  72  83  85
18                                                                      53  19  28  59  74  85  90
19                                                                          25  22  20  22  45  45
20                                                                              78  20  32  32  33
21                                                                                  42  43  42  39
22                                                                                      61  55  45
23                                                                                          62  63
24                                                                                             [?]

* r's in boldface are significant at the 1% level.

In addition to these examinations, the student’s previous high school
grade record was used for evaluation and admission. For admission
purposes the high school record was divided into two categories: those
subjects which were classed as liberal arts and those subjects which
were termed pre-engineering. The pre-engineering group of courses
consisted of mathematics, physical sciences (chemistry and physics),
mechanical drawing, and English. The courses remaining after these were
deducted from the transcript were loosely classed as liberal arts.

Tetrachoric r’s and the Standard Error ¹ for each of these r’s were
calculated between the entrance devices described above and the first
four semesters of work as well as the grouped subject areas of
chemistry, mathematics, physics, and drawing. These correlations, as
well as the internal interrelationships, are shown in Table 1.
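
For readers unfamiliar with the tetrachoric coefficient, a rough value can be obtained from a fourfold table with the well-known cosine-pi approximation. The article does not say how its coefficients were computed (computing diagrams were common at the time), so the approximation, the function name and the cell counts below are all assumptions made for illustration.

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to tetrachoric r for a 2 x 2 table.

    a and d are the agreeing cells (high-high, low-low); b and c are the
    disagreeing cells. The approximation is best when both variables are
    split near the median.
    """
    if b * c == 0:
        return 1.0  # no disagreeing cases: treated here as perfect association
    return cos(pi / (1.0 + sqrt((a * d) / (b * c))))


# Invented tables of test score (high/low) against grade (high/low).
no_relation = tetrachoric_cos_pi(25, 25, 25, 25)
positive = tetrachoric_cos_pi(40, 10, 10, 40)
negative = tetrachoric_cos_pi(10, 40, 40, 10)
print(round(no_relation, 2), round(positive, 2), round(negative, 2))
```

Because the coefficient depends only on the four cell counts, it suits data such as grades that are naturally cut into pass/fail or above/below-median groups.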


Criteria 


The question of the reliability of specific sub- 
ject grades as well as the reliability of the 
semesters’ grades needs to be considered, be- 
cause if the reliability is low, little can be done 
to increase the validity of a test for selection 
purposes. The high intercorrelations existing 
between chemistry and physics grades .80, 
chemistry and mathematics grades .71, and 
physics and mathematics grades .86, would 
seem to indicate that some reliability exists, 
since the material learned in these three 
courses is related. Similarly, the fairly con- 
sistent intercorrelations between the first four 
semesters would appear to substantiate this 
belief. If the intercorrelations between semes- 
ters may be taken as an index of reliability, 
the reliability may be said to be as high as 
the relationship obtained between the measuring instruments and the
criteria, i.e., tests and high school grades versus college semester
and subject grades.


Results 


An examination of the correlations in Table 1 reveals some interesting
trends. One of the first immediately noticeable is the magnitude of the
intercorrelations existing between the sections of the P.E.I.
(Pre-Engineering Inventory), when these are compared with the r’s
existing for the various sections of the Junior Status Examination. One
possible explanation for this difference is that it is easier to
segregate the subject areas covered in an achievement type examination
into measurable units with little overlap. The substantial overlap
between sections of the P.E.I. arouses a question as to the success
with which it will predict the grades for a total semester and as to
the differential value of its various sections for specific subject
areas.


¹ The SE’s were estimated by applying the formula:

\[ SE_{r_t} \approx \frac{1.5\,(1 - r^{2})}{\sqrt{N - 1}} \]

According to Garrett (2), “An approximation to the SE of a tetrachoric
r may be found in the following way: the SE of r_t is about 50% higher
than the SE of an equivalent product-moment r. . . .”
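
The standard-error rule quoted from Garrett in the footnote can be turned into a small calculation. The sketch assumes the product-moment standard error takes the common form (1 - r²)/√(N - 1) and inflates it by 50%; the function names are hypothetical, and the r of .50 is an arbitrary example, though the N's of 144 and 444 are the limits reported in the text.

```python
from math import sqrt


def se_product_moment(r, n):
    """Large-sample standard error of a product-moment r (assumed form)."""
    return (1.0 - r * r) / sqrt(n - 1)


def se_tetrachoric_approx(r, n, inflation=1.5):
    """Garrett-style rule of thumb: about 50% larger than the product-moment SE."""
    return inflation * se_product_moment(r, n)


# Arbitrary r of .50 at the smallest and largest N's used in the study.
se_small_n = se_tetrachoric_approx(0.50, 144)
se_large_n = se_tetrachoric_approx(0.50, 444)
print(round(se_small_n, 3), round(se_large_n, 3))
```

On this approximation the same coefficient is estimated a good deal less precisely at N = 144 than at N = 444, which bears on which entries of Table 1 reach significance.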

Both the P.E.I. Total score and Composite score predict the four
semesters’ grades fairly well, the one exception being a .32
correlation between the first semester’s work and the P.E.I. Composite
score. The other correlations between the semesters’ grades and the
P.E.I. Total and Composite scores range from .46 to .54, which is close
to the median of .60 reported by Johnson for twelve engineering
colleges (3). On the other hand, the various sections of the P.E.I.
show fairly low correlations with grades for specific subject areas.
The highest single correlation existing between a subject area and a
section of the P.E.I. was .46 for the P.E.I. Scientific Materials and
Physics grades. This same lack of relationship and inability to
discriminate between subject areas was found by Moredock (4). It would
appear, therefore, that while the test shows usefulness in predicting
over-all success for the first two years of engineering curriculum it
is of little use in evaluating potential student success for specific
subjects.

At this point it is perhaps desirable to note that the P.E.I. Total and
Composite scores show little relationship with grades obtained in high
school “pre-engineering” subjects, or with grades obtained in those
high school subjects which have been designated as “liberal arts.” This
low intercorrelation, combined with the fact that the “pre-engineering”
high school subjects relate at a median of .42 with the first four
semesters of college work, has allowed the P.E.I. Total score and the
“pre-engineering” high school grade scores to be combined into a
successful selection device. It would appear desirable eventually to
design a new examination which would show less overlap between its
sections and greater differential value for subject areas.
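
Why two predictors that correlate little with each other combine well can be seen from the formula for the multiple correlation with two predictors. The article does not describe how the selection device was actually formed, so this is only an illustrative sketch; the function name and the correlations used are hypothetical.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: validity of each predictor; r_12: correlation between predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return sqrt(r_squared)


# Hypothetical validities of .50 (test) and .42 (high school grades).
low_overlap = multiple_r(0.50, 0.42, 0.20)
high_overlap = multiple_r(0.50, 0.42, 0.70)
print(round(low_overlap, 2), round(high_overlap, 2))
```

With the same two validities, the combination gains much more when the predictors overlap little (r of .20) than when they overlap heavily (r of .70), which is the situation the paragraph above describes.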

In an analysis of the results of the Junior Status Examination, which
is of the achievement type, it will be seen that the intercorrelations
for its sections range from .11 to .51. These low intercorrelations may
in turn be responsible for the higher r’s of the examination with the
specific subject areas. The differential value is highest for
chemistry, mathematics, and drawing. The physics section of the
examination, which has its highest correlation with chemistry grades,
was a shortened version restricted to Mechanics and Electricity. The
intercorrelations between both “pre-engineering” and “liberal arts”
high school subject grades with the total score of the Junior Status
Examination are also greater than those obtained with the P.E.I. The
correlations of the examination and the first four semesters of work
range from .31 to .74.
_ Two additional correlations which are not 
ìncluded in Table 1 have been obtained for the 
total of the Junior Status Examination and 
Success in the fifth and sixth semesters of 
Work. The correlation between the fifth se- 
Mester of work and the examination is .65, 
and the sixth semester is 58. These two 7’s 
Were obtained by the Pearson product moment 
method and are based upon 100 and 68 cases 
respectively, 
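
The Pearson product moment method named above can be sketched briefly. This is only an illustration of the standard formula: the function name `pearson_r` and the six pairs of scores are invented for the example and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 cases")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical examination totals and semester grade averages (invented).
exam_scores = [52, 61, 47, 70, 66, 58]
semester_grades = [1.4, 1.9, 1.2, 2.5, 2.0, 1.8]

r = pearson_r(exam_scores, semester_grades)
print(f"r = {r:.2f} based on {len(exam_scores)} cases")
```

With real data the two lists would hold, for each student, the examination total and the grade average for the semester in question.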

It should perhaps be noted that while this
paper has been devoted to an analysis of the
interrelationships existing between the results
of the examinations and high school grades
when used for entrance evaluation and the
grades received in the first two years of engi-
neering college, the examinations have proved
useful in many other ways that are difficult to
quantify. For example, information obtained
from one of the examinations has been used in
conjunction with a diagnostic interview to de-
termine the areas in which remedial work is
needed.


Conclusions 


1. The Pre-Engineering Inventory shows a 
consistent correlation with the grades from the 
first four semesters of work, which makes it 
useful as a selection device. 

2. The sections of the Pre-Engineering In- 
ventory show no clearly defined relationship 
with specific subject areas, which would make 
it useful for differential selection within engi- 
neering. 

3. The Pre-Engineering Inventory shows a 
low interrelationship with “pre-engineering” 
subject high school grades. 

4. High school “pre-engineering” subject 
grades show a consistent correlation with 
grades from the first four semesters of engi- 
neering college work. 

5. An achievement examination such as the 
Junior Status Examination shows both greater 
differential value and greater over-all relation- 
ship with semester grades. 


Received March 10, 1952. 


References 


1. Case, H. W. The utilization of psychology in
engineering courses. Amer. Psychologist, 1951,
9, 494.

2. Garrett, H. E. Statistics in psychology and educa-
tion. New York: Longmans, Green and Co.,
1947.

3. Johnson, A. P. Tests and testing programs. J.
Engng. Educ., 1951, 41, 277-282.

4. Moredock, H. S. Special abilities in pre-engineer-
ing studies. Unpublished Doctor's dissertation,
Univ. Calif., 1950.

5. Remmers, H. H., and Geiger, H. E. Predicting
success and failure of engineering students in
the school of engineering in Purdue University.
Purdue Univ. Stud. Higher Educ., 1940, 38,
10-19.


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


Differential Sex Responses to Items of the MMPI 


L. E. Drake 


Student Counseling Center, University of Wisconsin 


In a rather extensive study of the MMPI 
some incidental evidence has come forth which 
appears important enough to warrant an early 
report. The frequency of Yes, No, and ? re- 
sponses to each of the 550 items of the card 
form was obtained separately for 2,270 under-
graduate male students and for 1,148 unmar-
ried, undergraduate female students. All
were enrolled in the University at the time of 
testing and none obtained an L score over 70 
or an F score over 80. 

Excluding items to which 90 per cent or more 
of both groups responded in the same direc- 
tion and excluding those to which 10 per cent 
or less responded in the same direction, 306 
items yielded critical ratios between the sexes 
ranging from 2.0 to 32.3. A total of 43 items 
was selected from these 306. These were 
items that 50 per cent or more of the females
responded to in a direction in which less than
50 per cent of the males responded. Answer
sheets for 100 males and 99 females not in-
cluded in the original groups were scored for
the 306 items and the 43 items. The resulting
coefficient of correlation was +.80.


1 Tables of frequency counts for each item by sex
have been deposited with the American Documenta-
tion Institute. Order Document 3860 from American
Documentation Institute, c/o Library of Congress,
Washington 25, D. C., remitting $1.25 for photocopies
(6 X 8 inches) readable without optical aid or $1.25
for microfilm (images one inch high on standard 35
mm. motion picture film).
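
The item screening described above can be sketched as follows. The note does not give the formula used for the critical ratio, so the pooled-proportion standard error below is an assumption (it is the usual textbook form), and the two response proportions are invented for illustration. Only the group sizes, 2,270 males and 1,148 females, come from the text.

```python
from math import sqrt


def critical_ratio(p_a, n_a, p_b, n_b):
    """Difference between two proportions divided by its standard error.

    Assumes the pooled-proportion standard error; the note does not say
    which formula was actually used.
    """
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


def in_short_key(p_female, p_male):
    """Selection rule for the 43-item key: 50 per cent or more of the females
    answer in a direction that fewer than 50 per cent of the males choose."""
    return p_female >= 0.50 and p_male < 0.50


N_MALES, N_FEMALES = 2270, 1148   # group sizes reported in the note
p_female, p_male = 0.55, 0.40     # hypothetical proportions answering "Yes"

cr = critical_ratio(p_female, N_FEMALES, p_male, N_MALES)
print(f"critical ratio = {cr:.2f}, in 43-item key: {in_short_key(p_female, p_male)}")
```

An item with a large critical ratio can still fail the second rule, for example when both sexes answer in the same majority direction but at different rates.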

Answer sheets for 3,229 males and 1,612 fe-
males were then scored with the 43-item key.
Only 2% of the females obtained a score as
small as or smaller than the mean score for
the males and only 2% of the males obtained
a score as large as or larger than the mean
score for the females. That this sex difference
is reliable is further indicated by the fact that
a coefficient of correlation of +.80 was ob-
tained between scores (43-item scale) ob-
tained by 474 males and 224 females on the
group form taken at time of entrance to the
University and the card form taken up to one
year later.

It is quite apparent that sex is an important
factor in establishing criterion groups, espe-
cially for scale construction for this type of
inventory.


Received October 23, 1952. 
Early publication. 


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


A Study of Medical Students with the MMPI: III. 
Personality and Academic Success 


William Schofield 


University of Minnesota 


Two previous papers in this series have re-
ported general normative data for samples of
medical students studied with the Minnesota
Multiphasic Personality Inventory (1) and
the nature of changes in the MMPI profiles of
students from the freshman to junior years of
the medical curriculum of the University of
Minnesota (2). This paper, the last in the
series, is concerned with the relationships be-
tween MMPI profiles and academic success.


Class Standing and MMPI Profile 


The Dean of the Medical Sciences provided
data on the total honor point ratios at the
completion of the junior year of the members
of the class used in this study. The Student
Counseling Bureau provided the American
Council on Education Psychological Examina-
tion (ACE) scores of the students. With
these data at hand, it was possible to select
students from the upper and lower quarters of
the class who were matched for scholastic
aptitude as measured by the ACE. These
matched groups were then studied for similar-
ities and differences on the MMPI.

The forcing of homogeneity on the aptitude
variable resulted in very small samples (11
students each) from the upper and lower quar-
ters, but this was considered preferable to the
use of larger N's with uncontrolled aptitude
variance. The differences between the ACE
scores of matched upper and lower quarter
students ranged from zero to nine percentile
points, with an average difference of five per-
centile points.
Figure 1 shows the mean freshman year
profiles of the upper and lower quarter
students. Three of the clinical scales showed a
statistically reliable difference in the mean
scores of the two groups. The lower quarter
group had reliably higher mean scores on the
Hy, Pd, and Sc scales. Table 1 presents the
data for these comparisons. The differences
between the two groups are seen to be limited
to a very few scales and, while statistically
reliable, are not great. In general, as seen in
Figure 1, the profiles of the upper and lower
quarter students are similar, particularly in
terms of the relative elevations of the “char-
acter structure” scales of the right side of the
profile. To the degree that the good and
poor achievers are distinguished by the
MMPI, the distinction appears in the relative
degree of hysteroid, psychopathic, and schiz-
oid tendencies; the poorer achievers are char-
acterized by a tendency to unrealistic ap-
praisals of their environment, unhappy social
relationships, and autistic rumination. Also,
the poor achievers show a general tendency
toward a relatively unsophisticated denial of
personal weaknesses and the expression of an
idealized self-concept (L). In this regard, the
students who work up to capacity tend to
manifest a more realistic self appraisal.


Fig. 1. Mean freshman year MMPI profiles of
samples of medical students from the top and bottom
quarters of their class at the end of the junior year
(N = 11).

As an approach to testing the predictive
significance of the group differentiations
turned up in this comparison of mean profiles,
the total class of students was sorted into two
groups: (1) a group each member of which
had an MMPI profile characterized by one
or both of the two highest scores falling on the
Hy, Pd, or Sc scales; and (2) a group whose
profiles did not have the above characteristics.
From these two groups, two smaller samples
were drawn so that each member from the
group with Hy, Pd, or Sc high points was
matched with a member from the other group
for ACE score. Then a study was made of the
honor point ratios of these two samples which
were equated for scholastic aptitude but dif-
ferentiated by the presence and absence of
certain scales as high points in their MMPI
profiles. Table 2 reports the mean honor
point ratios (HPR) of these two samples.
The group characterized by profiles with high
points on the Hy, Pd, or Sc scales yielded
a mean HPR clearly inferior to that of the
group not so characterized. However, since
the variances of the two groups differed signifi-
cantly, it was not possible to test for the reli-
ability of the difference between the means.
It may be concluded, nevertheless, that these
results do not support the hypothesis that the
populations of which they are representative
have identical distributions of honor point
ratios.


Table 1

Means and Standard Deviations of the Freshman MMPI Scores for Samples of Upper Quarter and
Lower Quarter Medical Students Matched for Scholastic Aptitude¹ (N = 11)

               Upper Quarter       Lower Quarter
Scale²         Mean    Sigma       Mean    Sigma        t         F
L               1.2     1.4         2.7     1.4       4.26**     [?]
F               3.7     3.2         3.2     2.6        .46       [?]
K              57.1     5.9        61.0     7.4       1.48       [?]
Hs             47.4     7.8        50.9     [?]       1.03       [?]
D              47.4     9.5        50.5     8.8        .65       [?]
Hy             49.3     7.0        57.4     6.8       3.08*      1.0
Pd             53.0     6.7        57.1    10.1       3.03*      2.29
Mf             64.0     6.9        62.0    13.4        .36       3.74
Pa             49.7     8.6        50.0     7.2        .10       1.44
Pt             50.9     6.3        54.6     9.0       1.09       2.07
Sc             51.0     5.2        55.4     7.4       2.06*      2.01
Ma             59.7    10.0        63.5     7.3       1.28       1.88

¹ Academic standing determined from honor point ratio at end of junior year. Upper and lower quarter
students matched for ACE.
² Statistics based on raw score data; for raw scores on L which are less than 3, arbitrarily T scores of 50
are [remainder illegible in the source].
* Significant at 5% level.
** Significant at 1% level.
[?] Entry illegible in the source.
Figure 2 shows the actual distribution of
HPR's for the two samples. The greater
range of achievement in the group not char-
acterized by high points on Hy, Pd, or Sc is
clear from this figure. While there is
overlap of the two distributions, a cutting score
at HPR = 1.6 shows only 23% of the group
with the specified high points to have HPR's
larger than this value, while 62% of the other
group fall above HPR = 1.6. At the other
end of the distribution, while 43% of the Hy-
Pd-Sc high point group have HPR's < 1.5,
only 14% of the other group fall below this
score. It appears that the presence of the
highest or next to highest score of a medical
student's profile on the Hy, Pd, or Sc scales
is highly predictive of underachievement.


Table 2

Means and Standard Deviations of the Honor Point
Ratios of Two Groups of Medical Students
Differentiated by MMPI High Points
and Matched for ACE Scores

               Honor Point Ratios (b)         ACE
Group (a)  N     Mean    Sigma           Mean    Sigma
A         21     1.48     .22             [?]     [?]
B         21     1.80     .36            63.9     [?]
                 F = 2.68*

(a) Group A had profiles for which one or both of the
two highest scores fell on the Hy, Pd, or Sc scales.
Group B did not have either of their two highest scores
on the Hy, Pd, or Sc scales.
(b) Total honor point ratio at end of junior year.
* Significant at 5% level.
[?] Entry illegible in the source.


Fig. 2. Distribution of honor point ratios (scale 1.0 to 2.5) of two groups of medical students
differentiated by MMPI high points and matched for ACE scores. Group A (N = 21) had profiles for
which one or both of the two highest scores fell on the Hy, Pd, or Sc scales. Group B (N = 21) did
not have either of their two highest scores on the Hy, Pd, or Sc scales.
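
The F reported in Table 2 can be checked from the two sigmas of the honor point ratios. The short sketch below assumes the sigmas read .22 and .36 (the decimal points are lost in the scanned table) and that F is the larger variance divided by the smaller. The function name is illustrative only.

```python
def variance_ratio(sigma_1, sigma_2):
    """F ratio formed as the larger variance over the smaller variance."""
    var_1, var_2 = sigma_1 ** 2, sigma_2 ** 2
    if min(var_1, var_2) == 0:
        raise ValueError("variance ratio is undefined when a sigma is zero")
    return max(var_1, var_2) / min(var_1, var_2)


# Sigmas of the honor point ratios for Groups A and B, as read from Table 2.
f_ratio = variance_ratio(0.22, 0.36)
print(round(f_ratio, 2))  # 2.68
```

Under these readings the result agrees with the tabled F = 2.68, which is the basis for the statement that the variances of the two groups differ significantly.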
Another approach to this study of relation-
ships between personality and medical school
achievement was made by studying the rela-
tive success of students with deviant MMPI
profiles and those with profiles within the nor-
mal range. There were 18 members of the
class (21.6%) who had freshman MMPI pro-
files with at least one of the clinical scales
showing a T-score of 70 or greater. Fifteen
of these constituted the “deviant” sample.
Fifteen students with profiles entirely within
the normal range were selected so as to be
matched with the deviant group for ACE
scores. Table 3 presents the honor point ratios
for the two groups at the end of the junior
year. Figure 3 shows the mean profile of the
“deviant” group and the mean profile of the
“non-deviant” group. [...] the fact that ten of
the fifteen students in this group had a score
of 70 or greater on the Mf scale. It is obvious
that the two groups are essentially identical
in their academic performance as expressed by
the honor point ratio.

Table 3

Means and Standard Deviations of Honor Point Ratios
(Junior) of Medical Students with “Deviant” and
“Non-Deviant” Freshman MMPI Profiles

                      ACE Percentile         HPR
Group           N     Mean    S.D.       Mean    S.D.
Deviant        15     73.4    548 [sic]  1.58     .49
Non-Deviant    15     72.7    523 [sic]  1.56     .32


Fig. 3. Mean MMPI profile of a sample of fifteen
medical students having at least one clinical score
over T = 70, and mean profile of fifteen students with
no deviant score, both samples matched for ACE
scores.


It was considered of interest to make one
additional study of MMPI profiles and
achievement. This was a study of the rela-
tionship between medical school class rank at
the end of the junior year and the amount of
difference between the freshman and junior
year MMPI profiles. For this purpose the
same matched samples of upper and lower
quarter students for whom mean honor point
ratios are reported above were used. Figures
4 and 5 indicate the freshman and junior pro-
files of these two samples. It is quite clear
that the upper quarter students show a much
greater change in their mean profile over the
two year interval than do the lower quarter
students. The top quarter sample showed a
reliable increase in mean score on the Sc scale,
and statistically significant decreases in means
on the Mf and Ma scales. Thus, the top
quarter students showed a tendency after two
years in the medical curriculum toward a “de-
femininization” of their interest and activity
pattern (Mf) although remaining clearly de-
viant from the general population males.
Likewise, their morale, optimism, enthusiasm
and self confidence showed a drop toward the
general population norm (Ma) which may re-
flect a more realistic appreciation of their ca-
pacities and the demands of medical training.
The increase in Sc suggests a tendency to
greater self analysis and general philosophical
probing which is probably in line with the
drop in manic features.

By contrast the bottom quarter sample re-
vealed little tendency to reliable change over
the two year interval, the sole change having
statistical reliability being a drop in the Ma
score suggestive of mild deterioration of
morale. Tables 4 and 5 present the means
and standard deviations of the scale scores of
both years and both samples together with
measures of the reliability of the freshman-
junior differences.

As a further check on the relationship be-
tween amount of change in MMPI profile and
academic performance a scattergram was pre-
pared to show the joint distribution of these
two variables for the entire class of 83 stu-
dents. The “change” score for each student
was obtained by adding, without regard to
sign, the differences between his freshman and
junior scores on each of the nine clinical scales.
For the total sample, this variable showed a
range of 29-118 T-score points, with a mean
of 54.90 and a standard deviation of 17.10.
The honor point ratios for the group had a
range of 1.1-2.8, with a mean of 1.71 and a
sigma of .35. Inspection of the scattergram
did not suggest any marked relationship be-
tween these two variables although it did ap-
pear that there was a slight tendency for
higher honor point ratios to be associated with
higher change scores. Table 6 indicates the
means and sigmas of the honor point ratios
for the subjects having the 20 lowest and the
20 highest MMPI change scores. The differ-
ence between the mean honor point ratios of
these two groups is not statistically reliable.
Comparison of the distribution of honor point
ratios of the two groups revealed considerable
overlap.


Fig. 5. Mean freshman and junior year MMPI
profiles of a sample of medical students in the bottom
quarter of their class at the end of the junior year
(N = 11).
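
The “change” score defined above is a sum of absolute differences over the nine clinical scales. A minimal sketch follows; the two profiles are invented T-scores for one hypothetical student, not values from the study, and the function name is illustrative.

```python
# The nine clinical scales over which the change score is summed.
CLINICAL_SCALES = ("Hs", "D", "Hy", "Pd", "Mf", "Pa", "Pt", "Sc", "Ma")


def change_score(freshman, junior):
    """Sum, without regard to sign, of freshman-to-junior T-score differences."""
    return sum(abs(junior[scale] - freshman[scale]) for scale in CLINICAL_SCALES)


# Hypothetical T-scores for one student (invented for illustration).
freshman = {"Hs": 47, "D": 50, "Hy": 52, "Pd": 55, "Mf": 64,
            "Pa": 50, "Pt": 51, "Sc": 51, "Ma": 60}
junior = {"Hs": 50, "D": 46, "Hy": 49, "Pd": 58, "Mf": 59,
          "Pa": 52, "Pt": 53, "Sc": 56, "Ma": 54}

print(change_score(freshman, junior))  # 33
```

Because signs are ignored, a student whose scores rise on some scales and fall on others receives the same change score as one whose scores all move in a single direction by the same amounts.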


Table 4

Means and Standard Deviations of Freshman and Junior Year MMPI Scores for a Sample of
Top Quarter Medical Students (N = 11)

               Freshman               Junior
                     Standard               Standard
Scale        Mean    Deviation      Mean    Deviation       t         F
L             1.2      1.4           1.5      1.1          .42       1.59
F             3.7      2.8           3.1      1.8          .98       2.50
K            57.1      5.4          61.6      7.2         1.66       1.79
Hs           47.4      [?]          50.0      5.6         1.06       1.57
D            47.4      8.6          46.8      8.0          .26       1.16
Hy           49.3      6.4          46.2      7.7         1.38       1.45
Pd           53.0      6.1          57.6      9.6         1.16       2.47
Mf           64.0      6.2          59.3      7.4         2.65*      1.41
Pa           49.7      7.8          51.3      6.0          .68       1.68
Pt           50.9      5.7          52.3      6.4          .56       1.25
Sc           51.0      4.7          56.0      5.7         [?]        1.47
Ma           59.7      9.2          54.8      7.8         2.22*      1.38

* Significant at 5% level.
** Significant at 1% level.
[?] Entry illegible in the source.


Table 5

Means and Standard Deviations of Freshman and Junior Year MMPI Scores for a Sample of
Bottom Quarter Medical Students (N = 11)

               Freshman               Junior
                     Standard               Standard
Scale        Mean    Deviation      Mean    Deviation       t         F
L             2.7      1.2           1.8      2.4         [?]        3.50*
F             3.2      2.4           2.4      1.4         [?]        2.98*
K            61.0      6.8          53.0      6.2          .85       1.18
Hs           50.9      6.4          51.7      5.1          .03       1.60
D            50.5      8.0          50.2      7.4          .01       1.15
Hy           57.4      6.2          56.5      5.6          .07       1.24
Pd           57.1      9.2          58.6      7.0          .25       1.72
Mf           62.0     12.2          59.2      9.6          .94       1.61
Pa           50.0      6.5          50.2      [?]          .13       1.21
Pt           54.6      8.2          52.7      5.1          .57       2.59
Sc           55.4      6.6          57.5      6.8          .70       1.05
Ma           63.5      6.6          56.6      [?]         2.50**     1.14

* Significant at 5% level.
** Significant at 1% level.
[?] Entry illegible in the source.


Table 6

Means and Standard Deviations of Honor Point Ratios
of the 20 Students Having the Lowest Amount of
MMPI Score Change, Freshman to Junior
Year, and of the 20 Students with
Greatest Change

                     Honor Point Ratio
Group           N     Mean    Sigma
Low Change     20     1.7      .28        F = 1.746
High Change    20     1.8      .3[?]      t = .84

[?] Digit illegible in the source.


Summary and Conclusions


Using total honor point ratio at the end of
the junior year of medical school as a crite-
rion, an attempt was made to measure the
relationship between personality as
revealed in a freshman year MMPI profile and
academic performance. Also a study was
made of the relationship between amount of
personality change between the freshman and
junior years and scholastic achievement.
These analyses were based on data for 83 male
students who entered the University of Min-
nesota Medical School in 1946.

1. When the average profile of upper quar-
ter students was compared with that of lower
quarter students, with the subjects of the two
samples matched for ACE scores, certain of
the scales revealed reliable mean differences
between the two groups. The scales yielding
reliable differences between the top and bot-
tom quarter samples were Hy, Pd, and Sc. In
general, the low quarter students revealed a
tendency toward greater neuroticism and de-
fection in interpersonal and social relation-
ships.

2. When students were separated into two
groups with members of the groups matched
for academic aptitude (ACE) but differen-
tiated by the occurrence and non-occurrence
of high points of their profiles on the Hy, Pd,
or Sc scales of the MMPI, it was found that
[...] academic performance.

3. [When a group of stu]dents each of whom
h[ad at least one deviant] score was compared
with a [group without such] elevations, with
members of the two groups matched for ACE,
the [honor point ratios of] the two groups were
found to be essentially identical.


4. It was found that the samples of first
and fourth quarter students, equated for ACE
scores, were very different with respect to the
amount of change in their respective MMPI
profiles from the freshman to the junior year.
The top quarter students revealed a reliable
change in mean score on three of the nine
clinical scales (Mf, Sc, and Ma). The bottom
quarter sample showed reliable freshman-
junior changes only in decrease in their Ma
score.

5. When a comparison was made of the
honor point ratios of students having the
largest amount of change in their
MMPI scores from the freshman to the junior
years and the honor point ratios of students
with the smallest amount of change, it was
found that the two groups had essentially
identical honor point ratios and there was con-
siderable overlap between the two groups.

6. In general, it appears that when aca-
demic aptitude is constant, the likelihood of
achievement up to capacity in the medical cur-
riculum becomes less as hysteroid, psycho-
pathic, and schizoid traits, measured by the
MMPI, are greater. It may be hypothesized
that students who show both a restricted
scholastic promise and marked deviation on
the Hy, Pd, or Sc scales would be particularly
poor academic risks. In the absence of any
limitation of academic aptitude, the admis-
sion to medical training of students showing
chief deviations (even though within
“normal” limits) on the Hy, Pd, and Sc vari-
ables would appear to make for a lowering of
the general level of scholarship of the medical
school class.


Received April 28, 1952.


References


1. Schofield, W. A study of medical students with
the MMPI: I. Scale norms and profile pat-
terns. J. Psychol., 1953, 36, in press.

2. Schofield, W. A study of medical students with
the MMPI: II. Group and individual changes
after two years. (In press.)


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


A Note on Ranking Method 


Douglas Irvine 
Army Operational Research Group, Surrey, England 


Kendall (2, p. 89) mentions: “. . . the 
desirability of examining the primary data to 
see if there are any obvious effects present.” 
The present note aims to enlarge upon this. 
Ranking is often a quick and meaningful
method of obtaining various types of psycho-
logical data. Sometimes subjects are asked to
rank various items in order of preference, such
as those in the Job Preferences Scale of Jur-
gensen (1). Usually, in order to obtain some
sort of over-all picture such rankings for each
item are summated, and mean ranks are then
calculated. After this the mean, or total
scores are placed in rank order once more.

Such a procedure may conceal important
information. This is becoming apparent in
data being accumulated on an Anglicized ver-
sion of Jurgensen’s Scale, but, as the number
of subjects in this study is small, the data will
not be published at present. It seems that
such data should first be arranged to show the
number of times each item is placed in each
rank. In other words a frequency distribu-
tion should be made for each item. This dis-
tribution may be turned into a graph for those
who prefer to look at their results in this way.

The following example should make this
clear, although it is much more simple than is
usually encountered, since preferences are
asked concerning three items only. Wyatt,
Langdon, and Stock (3, p. 12) asked 19 opera-
tives engaged on chocolate (candy) packing
to rank in order of preference three sizes of
boxes, viz., 14 lb., 4 lb., and 1 lb. (The situ-
ation is somewhat unreal in Britain today.)
Their results are as follows:


                  1st    2nd    3rd
Large boxes        10      1      8
Medium boxes        3     14      2
Small boxes         6      4      9


If these figures are summated, their mean
ranks calculated, and the boxes arranged to
give a final ranking for all operatives, we have
the following results:


                  Σfx    Mean   Rank
Large boxes        36    1.89     1
Medium boxes       37    1.95     2
Small boxes        41    2.16     3

This latter table suggests that there is little 
difference in the over-all order of preference, 
whereas in fact the operatives tend to either 
like, or dislike, both the large and the small 
boxes, according to individual choice, while 
most of them place the medium boxes in the 
middle. The investigators found this to be of 
some importance, as there was a close corre- 
spondence between output in packing one type 
of box and preference for that box. This, 
however, is not the place for an argument into 
which is cause and which effect. The example 
is merely an illustration of Kendall’s plea. 
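
The arithmetic of the example can be reproduced in a few lines. The sketch below uses the frequencies quoted from Wyatt, Langdon, and Stock; the variable and function names are illustrative only. It prints the summed and mean ranks together with the share of operatives placing each box size first or last, which is the information the mean rank conceals.

```python
# Number of operatives placing each box size 1st, 2nd and 3rd (from the text).
rank_counts = {
    "Large boxes": [10, 1, 8],
    "Medium boxes": [3, 14, 2],
    "Small boxes": [6, 4, 9],
}


def rank_sum(freqs):
    """Sum of rank times frequency (the summed-rank column of the table)."""
    return sum(rank * f for rank, f in enumerate(freqs, start=1))


def mean_rank(freqs):
    """Mean rank given by the operatives to one item."""
    return rank_sum(freqs) / sum(freqs)


def extreme_share(freqs):
    """Proportion of operatives placing the item either first or last."""
    return (freqs[0] + freqs[-1]) / sum(freqs)


for item, freqs in rank_counts.items():
    print(f"{item:13s} sum={rank_sum(freqs):3d} "
          f"mean={mean_rank(freqs):.2f} first-or-last={extreme_share(freqs):.0%}")
```

The mean ranks differ by less than three tenths of a rank, yet 18 of the 19 operatives put the large boxes at one extreme or the other, against 5 of 19 for the medium boxes.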


Received April 24, 1952.


References


1. Jurgensen, C. E. Selected factors which influence
job preferences. J. appl. Psychol., 1947, 31,
553-563.

2. Kendall, M. G. Rank correlation methods. Lon-
don: Chas. Griffin and Co., Ltd., 1948, pp.
160.

3. Wyatt, S., Langdon, J. N., and Stock, F. G. L.
Fatigue and boredom in repetitive work.
I.H.R.B. Report No. 77, London, HMSO,
pp. 86.


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


Identification of American, British, and Lebanese Cigarettes 


E. Terry Prothro 


American University of Beirut, Lebanese Republic 


Habitual smokers generally believe that 
they can differentiate between various brands 
of cigarettes. In many countries a smoker has 
a wide choice of both domestic and foreign 
cigarettes, and there is often a substantial dif- 
ference in price between one brand and an- 
other. Obviously smokers must believe in a 
discriminable superiority of the more expen- 
sive brands if these brands are to be smoked 
for reasons other than “conspicuous consump- 
tion.” Advertisers, of course, encourage the 
belief that different cigarettes have unique 
characteristics. 

Investigations, however, by American psy- 
chologists throw considerable doubt on the be- 
lief that cigarettes can be identified by persons 
who do not know the brand they are smoking. 
Hull (1) found in the course of investigations 
on another problem that his Ss frequently 
failed to distinguish between tobacco smoke 
and warm moist air if visual cues were elimi-
nated by a blindfold. Husband and Godfrey
(2) requested blindfolded Ss to identify five
American brands of cigarettes, and found that
performance was only slightly better than
chance on all brands except a mentholated
one.

More recently Ramond, Rachal and Marks
(3) examined the ability of habitual smokers
to identify three popular American brands.
They gave each of their subjects a practice
smoking session during which there was op-
portunity to study the characteristics of the
three brands. They did not blindfold their
subjects on grounds that “a blindfold ob-
scures the central problem.” Thus the sub-
jects could examine the texture of cigarette
paper, the color and size of the tobacco shreds,
etc. During the test session gummed labels
were placed over the brand names. Their sub-
jects were able to identify each of the three
brands slightly more often than chance.
Smokers who preferred one of the three brands
were able to identify that brand significantly
more frequently than could smokers of other
brands.

If we grant that even habitual smokers in
America have difficulty in distinguishing be-
tween American brands of cigarettes, two
questions present themselves. Is the difficulty
a result of similarity of all tobacco smoke?
Does the fact that American subjects in these
experiments tend to smoke one brand of ciga-
rettes to the exclusion of others affect their
ability to identify non-preferred brands?

The situation in Lebanon is well suited to
a preliminary investigation of these questions.
Both American and non-American brands are
used extensively, and the difference in price
of various brands causes college students to
vary the brand purchased as their own finan-
cial status fluctuates. Also there is some
fluctuation in availability of brands on the
market.

In Lebanon, as in most of the Arab Near
East, American, English and domestic ciga-
rettes are available. Of these the American
cigarettes are the most expensive. English
brands cost about 10 per cent less. Domestic
cigarettes, made from tobacco grown in the
Near East, are about half as expensive as
American brands. The sale of cigarettes is
under control of a government-supported
monopoly which establishes prices and deter-
mines what cigarettes are to be imported. At
the present time two American brands, Camel
and Lucky Strike, and two English brands,
Players and Gold Flake, are found on the
market.


Procedure 


Subjects were 50 male college students who
stated that they smoked at least five cigarettes
per day. They were obtained by asking for
volunteers from the student body of the Amer-
ican University of Beirut.

Each S was brought into a well-ventilated
room and shown a table on which there were
six packages of cigarettes. There was one
package of each of the following brands: Camel,
Lucky Strike, Gold Flake, Players, Bafra and
Star. The choice of American and English
brands was based on availability to con-
sumers. Bafra and Star were selected as being
among the most popular of the Lebanese ciga-
rettes. An effort was made by E to use ciga-
rettes of equal freshness. The S was told that
he would be presented with one cigarette from
each of the six packages in turn, and that he
was to try to identify each cigarette im-
mediately after smoking it. He was warned
that each guess was final and that he could
not change his opinion about one cigarette
after smoking some of the others. Thus if he
guessed Bafra for the first cigarette and then
decided that the second cigarette was actually
Bafra, he was permitted to name the sec-
ond “Bafra” but not permitted to change his
guess on the first cigarette.

The S was next asked which cigarette he
[preferred]. Then he was seated and blind-
folded. Cigarettes were placed in wooden
holders which were 6 cm. long. The holder
was placed in the S's mouth and the cigarette
was lit for him. He was not permitted to
touch or see the cigarette at any time. As
soon as he identified the cigarette it was re-
moved and placed in a water-filled can. The S
was then permitted to rinse his mouth at a
water fountain just outside the experimental
room. Approximately two minutes elapsed
from the time one cigarette was identified until
the next one was lit. It was hoped that the
use of blindfolds and holders would minimize
available cues so that successful identifications
not attributable to chance might be attributed
to the qualities of the smoke itself.

The order in which the cigarettes were pre-
sented varied from subject to subject, and was
determined by use of a table of random num-
bers.


Results


It can be seen from Table 1 that our sub- 
jects were able to identify the American and 
English brands about half of the time and to 
identify the Lebanese brands even more often. 
Bafra was the most easily identified of these 
brands. Only seven of the subjects failed to 
identify it. Of the 300 attempts at identifica- 
tion, 180 or 60 per cent were correct. These 
results are considerably better than chance, 
and the superiority to chance is highly signifi- 
cant statistically. The value of chi-square for 
Table 1 is 462. This value is much too large 
to be found in the average table of chi-square. 
Moreover, each brand was identified better 
than chance. If we consider only the cells 
which pertain to correct identification, we find 
that the value of chi-square for these cells 
varies from 32 to 122. All of these values are 
highly significant.
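The per-cell values quoted above can be checked with a short calculation. The sketch below assumes that each cell's contribution is the usual (observed - expected)^2 / expected, with the expected count taken from the row and column totals. The correct-identification counts are read from the diagonal of Table 1, and the Star column total of 46 is inferred from the grand total of 300, since that entry is not legible in the printed table.

```python
# Correct identifications (diagonal of Table 1) and the column totals
# (number of times each brand was named). The Star total of 46 is an
# inference from the grand total of 300, not a value read from the table.
correct = {"Camel": 24, "Lucky Strike": 28, "Gold Flake": 25,
           "Players": 27, "Bafra": 43, "Star": 33}
named_total = {"Camel": 48, "Lucky Strike": 50, "Gold Flake": 47,
               "Players": 53, "Bafra": 56, "Star": 46}
SUBJECTS = 50   # each brand was smoked once by each of 50 subjects
TRIALS = 300    # 50 subjects x 6 cigarettes

def cell_chi_square(observed, row_total, col_total, grand_total):
    """Contribution of one cell to chi-square for a contingency table."""
    expected = row_total * col_total / grand_total
    return (observed - expected) ** 2 / expected

contributions = {
    brand: cell_chi_square(correct[brand], SUBJECTS, named_total[brand], TRIALS)
    for brand in correct
}

for brand, value in contributions.items():
    print(f"{brand:13s} {value:6.1f}")
print("Proportion correct:", sum(correct.values()) / TRIALS)
```

Under these assumptions the contributions run from 32 for Camel to about 121 for Bafra, close to the range of 32 to 122 reported in the text, and the diagonal sums to the 180 correct identifications cited.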

From these results it appears that all ciga- 
rette smoke is not the same. Habitual smok- 
ers can differentiate between these six brands. 

The difference between our results and the 
conclusions of Husband and Godfrey, and of 
Ramond et al. might lead us to conclude that 
the cigarettes which are popular in the Near 


Table 1

Number of Subjects Giving Each Response After Smoking Each of the Brands

Brand Named by Subject

Brand Smoked Camel Lucky Strike Gold Flake Players Bafra Star
Camel 24 5 3 12 3 3
Lucky Strike 8 28 8 4 1 A 
Gold Flake 7 7 25 8 1 2
layers f f, 3 27 3 2 
Si 0 1 il 2 43 A 
ži 2 0 8 
3 4 3 33 
Total 48 50 47 53 56 46


56 E. Terry Prothro 


East are less similar to each other than are the 
popular American brands. There is, however, 
another possible explanation. Lebanese stu- 
dents vary the brands smoked to a greater 
extent than do American students. Conse- 
quently the Lebanese students may be better 
able to differentiate cigarettes because of a 
more varied experience. 

In this connection it should be noted that 
our Ss identified the American brands quite 
successfully. There was little confusion be- 
tween Camels and Lucky Strikes. 

The results of Ramond et al. support the 
thesis that preference determines identifiabil- 
ity in America. Their Ss could identify the 
brand which they preferred more than 70 per 
cent of the time. On the other hand, those Ss 
who preferred a brand other than the ones 
used in the experiment averaged only 20 per 
cent correct identification, although chance 
performance was 33 per cent. 

Of our Ss, 44 expressed a preference for one of the six brands and 26 (nearly six-tenths) of these were able to identify the preferred brand. When we recall that exactly six-tenths of all 300 trials were correct, it is apparent that our students could identify non-preferred brands as readily as they could identify preferred brands.
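The comparison rests on two simple proportions, which can be set side by side as below. The counts come from the text; the chance level of one in six is an assumption that treats each of the six brand names as equally likely to be guessed.

```python
# Counts taken from the text.
preferred_correct, preferred_total = 26, 44   # Ss naming their preferred brand
all_correct, all_trials = 180, 300            # all identification attempts

p_preferred = preferred_correct / preferred_total
p_overall = all_correct / all_trials
p_chance = 1 / 6   # assumption: six equally likely brand names

print(f"Preferred brand identified: {p_preferred:.2f}")
print(f"All trials correct:         {p_overall:.2f}")
print(f"Chance level:               {p_chance:.2f}")
```

The two observed proportions differ by about one percentage point, which is the basis for saying that preferred brands were identified no more readily than the others.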


Summary 


A total of 50 male students at the American University of Beirut who smoke at least five cigarettes per day were asked to discriminate between six brands of cigarettes which are popular in the Near East. Of the six brands, two were American, two British and two Lebanese. Ss were blindfolded and presented with six cigarettes in succession. They were required to guess at the identification of each brand before proceeding to the next. All cigarettes were presented in wooden holders.

Ss were able to identify each of the six brands significantly more often than chance. Of all attempts at identification, 60 per cent were correct. It therefore appears that habitual smokers can discriminate between these cigarettes on a basis of the smoke alone.

It was pointed out that the superior performance of our subjects, even at distinguishing between Camels and Lucky Strikes, might be attributed to the tendency of Lebanese smokers to vary the brand smoked to a greater extent than do American smokers. The results are compatible with this thesis, for our subjects were able to identify non-preferred brands as readily as preferred brands. In contrast, a recent study (3) of American smokers demonstrated that they could identify the brand they preferred, but could not identify other brands.


Received March 5, 1952.


References 


1. Hull, C. L. The influence of tobacco smoking on mental and motor efficiency. Psychol. Monogr., 1924, 33, 161.

2. Husband, R. W., and Godfrey, J. An experimental study of cigarette identification. J. appl. Psychol., 1934, 18, 220-223.

3. Ramond, C. K., Rachal, L. H., and Marks, M. R. Brand discrimination among cigarette smokers. J. appl. Psychol., 1950, 34, 282-284.


The Journal of Applied Psychology
Vol. 37, No. 1, 1953


An Apparatus for Measuring Operational Hand Steadiness


J. Stanley Gray, George Sustare, and Anthony Thompson


University of Georgia


It is a patent belief of both job analysts and skilled workmen that the degree of skill is limited by hand steadiness. This study is an attempt to investigate that belief.

Apparatus for measuring hand steadiness (like the Whipple Tracing Board) seems to measure predominately static steadiness or tremor. Skilled work involves operational steadiness. Consequently, an apparatus was devised to measure hand steadiness in three dimensions. This stasiometer consists of a 24" x 30" base on which are mounted the ends of seven feet of 1/4-inch copper tubing bent in three dimensions, an electric counter, a transformer, and a knife switch. A brass stylus-ring, through which the tubing passes, is connected to one terminal of the transformer and the copper tubing to the other through the counter. Contact between the stylus-ring and the tube activates the counter. See Figure 1 for schema.


Fig. 1. Schematic diagram of stasiometer.


A run consists of passing the stylus-ring along the bent tubing from one end to the other and back again as rapidly as possible with the least number of contacts. After a practice training run, the time in seconds and the contacts (as recorded by the counter) are recorded for two runs, or four crossings. A minute rest is allowed between these two recorded runs.

Table 1

Mean Time and Contacts for Two Runs on the Stasiometer

                   Mean     S.D.
First Run
  Time (Sec.)      80.9     31.7
  Contacts        115.0     34.9
Second Run
  Time (Sec.)      78.9     29.6
  Contacts        107.2     35.3


The test was administered to a norm group 
of 400 undergraduate university students (222 
men and 178 women). The mean time and 
mean contacts for each of two runs are shown 
in Table 1. A table of sigma values was then 
constructed and each raw score was converted 
into a Z score (mean 100, sigma 30). A Z 
score was thus obtained for each run (time 
plus contacts). The coefficient of correlation 
between these runs was .84 ± .01, which was
interpreted to indicate a satisfactory reliabil- 
ity. 
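The conversion of raw scores to Z scores with mean 100 and sigma 30 can be illustrated as follows. The norms are the means and S.D.s printed in Table 1. The function name `standard_score` is hypothetical, and the reversal of sign (so that a faster run or fewer contacts yields a higher score) is an assumption, suggested by the later finding that skilled workmen score above 100 but not stated in the text. How time and contacts were combined within a run is not specified, so the sketch converts one measure at a time.

```python
# Norm-group means and S.D.s as printed in Table 1 of the article.
NORMS = {
    ("first", "time"): (80.9, 31.7),
    ("first", "contacts"): (115.0, 34.9),
    ("second", "time"): (78.9, 29.6),
    ("second", "contacts"): (107.2, 35.3),
}

def standard_score(raw, run, measure):
    """Convert a raw score to a Z score with mean 100 and sigma 30.

    Assumption: the sign is reversed so that fewer seconds or fewer
    contacts give a higher (steadier) score.
    """
    mean, sd = NORMS[(run, measure)]
    return 100 - 30 * (raw - mean) / sd

# A first-run time one S.D. faster than the norm-group mean.
example = standard_score(80.9 - 31.7, "first", "time")
print(round(example, 1))
```

A subject exactly at the norm-group mean receives 100 on that measure, and each S.D. of improvement adds 30 points.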

The Z scores for each run were then added
to obtain a total steadiness test score. The 
distribution of the 400 norm cases is shown 
in Figure 2. 

Various factors which might affect steadiness were studied but only two showed any statistical significance. Sex was a highly significant factor, the men having 10 Z-score points higher than the women. Smoking had only a slightly significant effect on steadiness as indicated in Table 2.

Fig. 2. Distribution of 400 standard scores on the stasiometer.


Table 2

The Effects of Smoking and Sex on Operational Hand Steadiness

                       Mean Steadiness
                 N        Z Score        S.D.     C.R.
Smokers         225        101.9         20.3     1.27
Non Smokers     175         99.2         21.5
Male            222        105.1         19.2     4.9
Female          178         95.1         21.3


A total of 100 members of the norm group was given the Edwards' finger tremor test.¹ The coefficient of correlation was .004, indicating that hand operational steadiness and hand static steadiness are not related.

¹ Edwards, A. S. The finger tromometer. Amer. J. Psychol., 1946, 59, 273-283.

Another group of 50 subjects was given the 
Purdue Pegboard dexterity test. The correla- 
tion of these scores with those on the stasi- 
ometer was .057. 

The stasiometer test was given to 50 skilled workmen (tool and die makers, machinists, sheet metal workers, and welding inspectors). The average Z score for this group was 121.33, S.D. 16.7, as compared with a mean of 100 and S.D. of 30 for the norm group. The CR of the difference between these averages was 7.6. Further validity data are being collected.

Apparently the stasiometer is a reliable instrument for measuring operational steadiness and it may have some usefulness in selecting apprentices for various skilled occupations.
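The critical ratios reported in this article can be approximately reproduced from the published means, S.D.s and Ns. The sketch below assumes the conventional critical ratio for two independent groups, the difference between means divided by the standard error of that difference. The article does not state its formula, so this is an assumption, and `critical_ratio` is a name introduced here.

```python
from math import sqrt

def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means divided by its standard error.

    Assumed formula for independent groups; the article does not say
    how its C.R. values were computed.
    """
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff

# Values from Table 2 and from the comparison with skilled workmen.
cr_sex = critical_ratio(105.1, 19.2, 222, 95.1, 21.3, 178)
cr_smoking = critical_ratio(101.9, 20.3, 225, 99.2, 21.5, 175)
cr_workmen = critical_ratio(121.33, 16.7, 50, 100, 30, 400)

print(f"Sex:     {cr_sex:.2f}")
print(f"Smoking: {cr_smoking:.2f}")
print(f"Workmen: {cr_workmen:.2f}")
```

Under this assumption the three ratios come out near the published values of 4.9, 1.27 and 7.6, which suggests that this is the formula the authors used.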


Received April 17, 1952. 


Book Reviews 


Miller, Delbert C., and Form, William H. 
Industrial sociology; An introduction to the 
Sociology of work relations. New York: 
Harper & Brothers, 1951. Pp. 896. $6.00. 


The objective of this book is to present the sociology of work relations. The word industrial is used as referring to all forms of economic activity. Industrial Sociology includes all types of occupations and all social groups which affect work behavior. Conceiving the subject from this point of view, the book deals with the interrelationships between the work behavior of the individual and the other aspects of his social activities. "The Framework of Industrial Sociology" (p. 30), as the five major subdivisions of the book, include: "I. Industrial Sociology: Its Rise and Scope; II. The Social Organization of the Work Plant; III. Major Problems of Applied Industrial Sociology; IV. The Social Adjustment of the Worker; and V. Industry, Community and Society."
tet olving themselves with such broad in- 
Pre St areas without reaching encyclopedically 
Gite” detail will stimulate instructors and 
authe to judge the content and choice of the 
are Ors’ evaluative selections. Such selections 
of e emplified by the identifying of “The rise 
fa odustrial sociology . . .” (P- 3) with the 
et Hawthorne experiments; 4 sample 
Soci iculum for the training of an industria 
On posist (p. 86), and a chart, included both 
the front cover and on page 11, listing the 
& ronologica] “Outlines of the Main Streams 
eq, Putaries of Industrial Relations ape 
Cal Contributed by the Basic & Ae ee 
inte Ciences,” The authors have provide a 
ti testing base from which to work irrespse 
E Df the specific selections of the nder F 
Which ed. Of particular value to poles 
at the CVetlaps many disciplines is & 8°° 
€ end of the book. 4 Tat 
tial a volume of this size there 1$ muc m 
Consi 'ch any particular group of readers a 
Sider extraneous. The first 306 pages 8° 


5 ‘al. 

s fies with basic infon ae eared 
Udent some par’ 

Ue, Ents who have had Iustrial eco- 


c f i 
ation in labor movements, 1 


Prese 


59 


nomics, business administration, and courses 
in applied psychology, these materials may 
prove to be repetitious and tend to cause a 
general letdown before such students get to 
the more strictly sociological material. As 
was suggested in an article by the reviewer in 
The Journal of Educational Sociology, No- 
vember 1950, “. . . the basic principles un- 
derlying industrial sociology are composed of 
established sociological principles and that in- 
dustrial sociology represents a distinctive area 
of investigation for the sociologist which in 
large part he has left to economists and psy- 
chologists.” 

This volume is very comprehensive. As a 
reference for allied courses the comprehensive- 
ness of this volume has great value in pointing 
up the interrelationship of the associated as- 
pects of industrial relations. As a text for in- 
dustrial sociology this very comprehensiveness 
makes it somewhat difficult to point up the 
basic underlying principles of this interest 
area. However, the book is unquestionably a
real contribution to the role and teaching of
industrial sociology.

Glaister A. Elmer
Air University Far East Research Group


IES Lighting Handbook. Second Edition. 
New York: Illuminating Engineering So- 
ciety, 1952. Pp. 974. $8.00. 

This new edition represents a thorough re- 
vision of the original Handbook which was 
published in 1947. Its objective is to provide 
its readers with essential information on light 
and lighting in simple terms and condensed 
style. The introductory chapters are con- 
cerned with the physics of light, light and 
vision, and nomenclature together with defini- 
tions and symbols. These are followed with 
several chapters dealing with measurement of 
light, color, light control, daylighting, light 
sources and lighting calculations. There are 
then several chapters dealing with lighting in 
various situations such as interiors, exteriors, highways, aviation, transportation, and photography. The book is concluded with an extensive appendix, manufacturer's data (advertising) and an index. The numerous charts and illustrations are very useful and excellently done.

Although designed for use by illuminating 
engineers, there is much material included that 
can be useful to the applied psychologist. 
Special mention might be made of the sec- 
tions on light and vision, nomenclature and 
measurement. Much of these materials should 
be known by the psychologist who is dealing 
with illumination in relation to visual comfort 
and efficiency. Other sections of particular 
interest to psychologists are the chapters on 
color and on interior lighting. Materials on 
recommended and standard practices are not 
included but the bulletins in these areas are 
listed opposite the title page. 

The collection and organization of materials 
in this Handbook represents an extensive and 
difficult task. The committee in charge is to 
be congratulated on achieving an excellent re- 
sult. No illuminating engineer or psychologist 
interested in the applied aspects of lighting 
can afford to be without this reference book. 
Nevertheless, there are a few reservations that 
occur to the reviewer: (1) There is a tendency 
to neglect psychological factors in adjustment 
of the individual to the illumination of work- 
ing and living environments. In future revi- 
sions it might be well to include a chapter on 
this subject. (2) Considerable work in the 
field of illumination has been done by psy- 
chologists. Examination of the lists of refer- 
ences fails to disclose these reports except for 
rare instances. It would seem that the best 
results in lighting could be achieved by co- 
ordinating the work of engineers with that of 
psychologists, physiologists including medical 
men, and physicists. (3) The presentation of 
certain data may lead to misinterpretations. 
For instance, in presenting Weston’s data, 
curves for relative performance but not for 
actual performance are given. The uncritical 
reader might interpret the curves presented to 
mean that, if the illumination is high enough 
discrimination of the low contrast test object 
will equal that of the high contrast one. Ex- 
orginal report reveals thet che ee tHe 
a similar manner ie dat: etal aed Tri 
(Cobb, Ferree anid R Te Peen A meon 

> and) is plotted in terms 




of the reciprocal of the time. This produces 
an exaggerated picture of the improvement 
with increase in illumination intensity. 

Miles A. Tinker 


University of Minnesota 


Frederiksen, N., and Schrader, W. B. Adjustment to college. Princeton: Educational Testing Service, 1951. Pp. xvii + 504.


Based on a study of 10,000 men veteran and non-veteran students in sixteen American colleges following World War II, this book has much to contribute to current educational and psychological theory and practice. Even though the population studied may, it is hoped, never be duplicated, the extent and form of this investigation are such that its implications are and will be important.

The book has a somewhat novel organization, for the whole study is summarized in the first chapter on a level clearly appropriate for the statistically untrained reader. In the remaining chapters, the results are presented in generally simple tabular and graphic form, while the basic tables and methodological notes are contained in the appendices. Although the first chapter is clearly intended for the lay reader, the level of difficulty of the rest fluctuates somewhat more than would seem desirable, and the college administrator who is tempted to read on may encounter rough going in certain places.

Two methodological points are of special interest. The authors use as the criterion of academic adjustment an index called the "Average Adjusted Grade," a ". . . measure of achievement-relative-to-ability." This is a standard score based on analysis of covariance procedures and represents a significant advance over other similar methods of computing such indices. Secondly, they make use of a sign test in assessing group differences, recognizing that samples from several colleges may be taken as replications of the experimental situation. This test makes for a precision too seldom encountered; it is hoped that it and the above criterion method will receive the attention they deserve.

The content of the book is a detailed description of the attitudes and behaviors of a group of men veterans and non-veterans and a comparison between these. In the process, the authors dispose of a number of erroneous notions about these groups in particular and college students in general. For one thing, the similarities reported are more evident than are the differences, and the authors wisely recognize the importance of such "negative" results. Consequently, the book contains many descriptions of generally applicable relationships (and lacks of relationship) between the criterion and such factors as extra-curricular activities, vocational decision, family income, outside reading habits, etc.

Most of the book deals with the results of a questionnaire intended to illuminate causes of obtained differences, if any. The authors are aware of the limitations of this method, and they report only what the students said. But it is easy to accept such statements at face value, something which may be, in view of what is known about testing attitudes, seriously misleading. It is to be hoped that other investigators will follow the many interesting leads provided and conduct studies employing more powerful tools. For any complete picture of the problem of college adjustment, there is a great need for an integration of such studies with those by others such as Pressey whose work on educational acceleration makes a fairly clear case in favor of the younger collegian. However, since this book is not intended to be a systematic integration but rather an extensive descriptive study, this should not be taken as criticism of it but only as an indication of pressing need for more work.

Methodologically, this study should serve as a model for further research. As far as results are concerned, both college administrators and psychologists interested in human adjustment problems should find in it a very great deal that is of interest and value. The study was imaginatively planned and carefully executed; the authors are to be commended for their excellent contribution.

John W. Gustad
University of Maryland


Kelly, E. L., and Fiske, D. W. The prediction 
of performance in clinical psychology. Ann 
Arbor: The University of Michigan Press, 
1951. Pp. 311. $5.00. 


This volume is the report of an ambitious 
five-year research program during the period 
1946-1951, which was directed at the evalua- 
tion of techniques for the selection of graduate 
students for training in a four-year doctoral 
program in clinical psychology. 

The first section of the report is devoted to 
a description of the operating philosophy of 
the project, which was to be both catholic and 
eclectic in the selection of predictors and cri- 
teria, a discussion of the sequential phases of 
the research program, and a presentation of 
normative data descriptive of the 700 subjects. 
Each subject was enrolled in one of 40 univer- 
sities and had field training in one of 50 VA 
installations. As one would anticipate, the 
normative data indicated that there was a 
hierarchy of universities in terms of ability 
and achievement of their students, and there 
were large differences in emphases of training 
programs at the various universities and VA 
installations. 

The second section deals with the three 
types of predictor measures under study. The 
first of these was a group of predictions by 
university staff members of the success of en- 
tering students upon examination of creden- 
tials only and upon examination of credentials 
plus interview in the following areas: Aca- 
demic Performance, Skill in Diagnosis and 
Therapy, Research Competence, and Overall 
Promise as a Clinical Psychologist. The sec- 
ond type of predictor was a series of objective 
tests from which 101 measures were obtained. 
These objective tests were commonly used 
measures of intelligence, interest, and person- 
ality, among the specific tests being the Miller 
Analogies (Form G), the Strong Vocational 
Interest Blank, and the Minnesota Multi- 
phasic Personality Inventory. The third type 
of predictor measure was a series of ratings 
based upon clinical procedures, which included 
intensive interviews and projective tests, and 
both individual and pooled ratings. Most interesting was a description of a pilot assessment program in which group situational tests were utilized in addition to other techniques. Factor analysis was performed to identify the first-order factors of the some 42 variables under investigation in the pilot assessment program.

The third and largest section describes the 
development of criterion measures. There are 
interesting analyses of many problems en- 
countered, such as the first-order versus sec- 
ond-order and specific versus general criteria 
problems. The authors found no satisfactory 
single criterion of success, although they did
identify three general components of success. 
These were: intellectual accomplishment, clin- 
ical skills of diagnosis and therapy, and skills 
in social relations. They found judges agreed 
much better on the first than the other two 
components. There are many ideas and find- 
ings from which others concerned with similar 
searches for the criterion will-o’-the-wisp may 
profit. Of course, the criteria developed are 
in a sense also predictors of later performance 
as clinical psychologists. Until a follow-up 
study has been made and the criteria utilized 
in this program have been related to on-the- 
job success, the findings of the program are 
questionable. The authors do state that they 
hope to follow-up their subjects some ten, 
fifteen or twenty years later. Certainly this 
study does merit a fitting sequel. 

The fourth section presents data upon the 
degree to which the predictor measures cor- 
relate with the criterion measures and contains 
a thorough discussion of various factors which 
have an influence upon the magnitude of the 
correlation coefficients. 

The final section contains a summary of 
the major findings. Space prohibits the reviewer
from commenting upon most of the
findings presented either in this section or 
throughout the volume, which is literally 
studded with interesting findings. To the re- 
viewer it was most significant that single ob- 
jective tests predicted most of the criterion 
measures (including global measures, such as 
“Rated Overall Clinical Competence”) just 
about as well as more laborious and time con- 
suming ratings by professional staff members, 
and that single projective tests were almost 
worthless in predicting criterion measures. 

In addition, there are several appendices which present many of the devices utilized in the study and certain other important information, such as rejected criterion measures.
While the general aim of the program as 
stated in the Preface was to evaluate tech- 
niques for the selection of professional per- 
sonnel, the authors do not purport to resolve 
all problems even within the limited area of 
the selection of clinical psychologists. Cer- 
tainly most of the predictive findings cannot 
be generalized to the selection of personnel for 
training in other professional areas, although 
many of the techniques should offer valuable 
suggestions to researchers. However, it is concentrated attacks of this nature which should eventually lead to the improvement of the selection of personnel for training in the professions. The study is must reading for all those working in the areas of prediction of professional success and of criterion research.


Stanley E. Jacobs 


Department of the Army, 
Washington, D. C. 


Parker, W. E., and Kleemeier, R. W. Human relations in supervision. New York: McGraw-Hill, 1951. Pp. vii + 472. $4.50.


At one time or another, most personnel men have struggled with the problem of improving the human relations skills of company supervisory personnel so there is a great deal of interest in any text which may prove to be useful in discussion or conference groups concerned with handling human relations problems.

In the authors' words, Human Relations in Supervision is "directed specifically to the first-line supervisor, because the establishment of good human relations in any organization stands or falls upon the skill of these supervisors in dealing with human problems." In general, the authors have succeeded in keeping the material at this level, employing many anecdotes, illustrations, and case studies in their attempt to relate human relations principles to the everyday experience of supervisors. One undesirable outcome of this type of treatment, however, is that much of the discussion on topics such as motivation, counseling, leadership and personal development is superficial.

In directing this book to the first-line supervisor, the authors have emphasized the handling of problems originating with the employee and do not treat directly the problems of the supervisor and his impact on the work group, the relationships between the various supervisory and management levels, or the effects of company policy and organization structure on the supervisor.

At least two suggestions for improvement come to mind. First, an introductory section or, possibly, a separate manual outlining the experience in companies using this material together with a statement as to the instructional methods employed and the outcomes of the training would be of great value. Second, the discussion questions following each chapter should be reworked since in their present form they invite class members to parrot back text material.

In summary, the authors have done a good job in assembling materials for a human relations course for men at supervisory levels. Whether the textbook-classroom approach which is indicated can or will produce the desired change is still an unanswered question.


William E. Kendall
The Chesapeake and Ohio Railway Company


Gray, J. Stanley. Psychology in industry.
New York: McGraw-Hill Book Company,
Inc., 1952. Pp. vii + 401. $5.00.
M his book reflects the author’s belief that 
of Y factor which affects the production efforts 
Workers is appropriately classified as in- 
in ial Psychology. This view has resulted 
indus erent type of book on psychology. $ 
book, ry. It is, however, 4 disapp g 
eg osiderable emphasis is given human engl- 
Ogi ing, work curves, physical and physio- 
a measurements of work, fatigue, e 
NRR nutrition, rest, monotony, bore B 2l 
han ing and ventilation. Some subjects ar 
dled differently than is customary; for ex 
Ple, merit rating is discussed in a chapter 
tug ee: A five-page appendix describes z 
e Strates calculations of the mean, standar 
ti Viation, standard error of the mean, correla- 
Coefficient, and significance of differences 


Ween means, 


Although all subjects discussed in the book 
may legitimately be included in the field of 
industrial psychology, relative emphasis de- 
viates sharply from that found in actual prac- 
tice. For example, twenty pages, or five per 
cent of the entire book, are devoted to nutri- 
tion. Subjects which are usually emphasized 
are discussed only briefly; for example, em- 
ployment interviewing is handled on one page. 
Thus the book should not be interpreted as 
giving a true picture of the field as it is com- 
monly conceived. 

The book has a number of faults: broad 
statements are undocumented, superficial defi- 
nitions are used, “obviousness” is used to sup- 
port statements, flat statements are made 
which run counter to experimental evidence 
published elsewhere, broad coverage of subject 
matter results in superficiality. On the other 
side of the ledger are favorable factors such 
as inclusion of material not generally readily 
available to beginning students, uncommon 
use of common sense, and astute insights. 
Unfortunately, however, the assets do not ap- 
pear to offset the limitations of the book. 


Clifford E. Jurgensen 
Minneapolis Gas Company 


Zaleznik, A. Foreman training in a growing enterprise. Boston: Harvard Business School, 1951. Pp. 232. $3.50.


“Is [supervisory] training realistic from the
supervisor’s point of view and in relation to 
his problems at work? The only way to de- 
velop an answer to this question in a par- 
ticular organization is to go to the work level, 
and to observe what is happening” (p. 232). 
The author had done just this. This book is 
concerned with the evaluation of a foreman 
training program in a small manufacturing firm 
through 5 weeks of intensive on-the-job study 
of one of the trainees, a newly appointed fore- 
man. Two other approaches to evaluation are 
also reported—observation of the training ses- 
sions and interviews with foremen. 

The training course evaluated appears to be a rather confusing hodge podge of academic psychology, rules-of-thumb for handling people, and pep talks, all of which are not uncommon approaches to foreman training in American industry today. In terms of being of value to this particular foreman in equipping him to better perform his job, the course was unsuccessful.

Despite the rather shaky design upon which 
this study is built where conclusions are drawn 
and recommendations made based on an N 
of a single foreman, this book makes a con- 
tribution. Its main value is in the convincing 
and meaningful manner in which the many 
complex relationships with which a modern 
factory foreman must deal are described. 
Pointing up how inadequately a typical packaged training program fulfilled the on-the-job needs of the foreman only helps to accentuate and sharpen the picture of the complexity of his job.

There are a number of weaknesses in the 
study. The author tends to draw too many 
definite conclusions and to overgeneralize from 
his single case. Many of the conclusions and 
recommendations are colored by the back- 
ground and training of the author. For ex- 
ample, the only recommended method of fore- 
man training discussed in any detail is the case 
method. Recommendations on the kind of 
training which would have helped the foreman 
more, including such things as coaching by his 
superior and permissive rather than authorita- 
tive conferences, are not new. However, de- 
spite these shortcomings, against the back- 
ground of the real needs of a live supervisor 
on the job, the conclusions and recommenda- 
tions still are much more convincing than they 
are when they appear as mere statements of 
opinion as is usually the case. 

Because no serious reader can come away [...] without a fuller
realization of [...] developing supervisors, [...] recommended to
persons concerned with supervisory training. If the book [...] broad
and continuing type of training needed for helping the foreman
perform his difficult job.


Theodore R. Lindbom 


Midland Cooperative Wholesale, 
Minneapolis, Minnesota 


Welch, J. S., and Stone, C. H. How to build 
a merchandise knowledge test. Research 
and Technical Report 8, Industrial Rela- 
tions Center, University of Minnesota. 
Dubuque, Ia.: Wm. C. Brown Company; 
1951. Pp. 21. $1.00. 


This excellent monograph concisely presents the methods for the
development of job knowledge tests. Although the purpose is to
describe the steps in the construction of information tests for use
in evaluating experienced salespersons, the procedures are general
and can be applied to any type of job.

The authors do not claim that they are describing any new methods.
What they have done is to bring together for trade tests procedures
for item development, item validation, test validation, cross
validation, and the setting of critical scores, in a most clear and
logical fashion. The rationale for each step is well outlined. The
monograph is liberally documented with judiciously chosen
illustrations, so that each step is readily understandable.

The monograph will not only serve as a technical manual for those
concerned with selection problems, but should be an invaluable piece
of outside reading for a course in test construction or in
psychological measurement. The only shortcoming is in the discussion
of the types of items that might be used in job information tests.
While a reader unfamiliar with the field will ultimately obtain some
notion concerning the scope of possible items, in no single section
is this aspect well developed.

Edwin E. Ghiselli 

University of California, Berkeley 


Vol. 37, No. 2


Journal of Applied Psychology


APRIL, 1953


Function Analysis of Thirty-Two American Corporate Boards * 


Jerome G. Kunnath and Willard A. Kerr 


Illinois Institute of Technology 


The insight of the general public and even of industrial
psychologists into the typical activities of corporate boards of
directors is probably somewhat vague and inaccurate. Since the
corporate board is an important policy making nerve center in
industrial society, it needs to be brought within the orbit of
psychological research.

This study, profiting from the activity analyses reported by
Flanagan on laboratory personnel (1), Gordon on airline pilots (2),
and Wagner on dentists (3), is, however, focussed on group rather
than individual behavior. What does the corporate board do in its
meetings? What are some of the probable determinants of what it
does? It is the purpose of this study to investigate some of the
behaviors of the corporate board.


Experimental Design 

Invitations to participate in a nation-wide study of corporate board
activities were sent to one board member of each of 246
corporations. These 246 names were selected at random from the
“Corporation Directory” section of Poor’s Register of Directors and
Executives, 1950. A total of 32 firms participated. In each instance
a member of the firm’s board of directors completed an “Industrial
Board Member Survey Chart” which listed 21 topics of board activity
under the following heading: “According to [...] business board
experience, [...] boards consider [...] how many meetings per year
[...].” The board member then indicated the number, out of [...] of
meetings per year, at which each [topic] is considered.

* [...] of work completed in the student research program of the
Industrial Psychology Laboratory of the Illinois Institute of
Technology.

In size, the corporations sampled ranged 
from 50 to 25,000 personnel, the median 
being 250. Sizes of boards ranged from 2 
to 23 members, 5 being the median. Mean 
ages of members of the 32 boards ranged 
from 49 to 73, the median being 58.7. The 
average number of other corporate boards 
to which the average board member in these 
firms belongs ranges from 0 to 14, the median 
being 2.6. The per cent of board members 
who also work in the operating management 
of a firm ranges from 13.3 to 100.0, the 
median being 66.6. Fourteen of the firms 
studied were in metropolitan areas (500,000 
population or more), and 18 were in less
populated localities. Geographic distribution 
of the 32 firms was closely representative of 
the national distribution of American indus- 
try. When classified according to kind of 
industry, the breakdown of firms is as fol- 
lows: heavy, 7; heavy-light manufacturing 
and transportation, 12; light manufacturing, 
8; commercial (retailing, utilities), 3; finance, 2.

The median frequency of topic considera- 
tion was computed for the 32 corporations on 
each of the 21 topics. Seven hypotheses were 
formulated as relevant to explanation of 
topic behavior variance. Objective data were 
obtained in order to make at least crude 
tests of these seven hypotheses. These hy- 
potheses pertain to metropolitan versus non- 
metropolitan locus of firm, number of per- 
sonnel in the firm, number of members on 
corporate board, per cent of board member 
overlap with operating management, kind of 
industry (most to least heavy), mean age 
of board members, and extent of service of
average board member on other boards. 
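
The hypotheses above are tested in Table 1 with tetrachoric coefficients, which require each variable to be cut into two classes. The article does not state how the cuts were made or how the coefficients were computed. The following is only a minimal sketch, assuming a median split on both variables and the common cosine-pi approximation, r = cos(pi / (1 + sqrt(ad / bc))). The function names and the eight-firm data are hypothetical.

```python
import math
from statistics import median


def fourfold_counts(x, y):
    """Split both variables at their medians and count the 2 x 2 cells.

    Returns (a, b, c, d): a = high/high, b = high x with low y,
    c = low x with high y, d = low/low. Values equal to the median
    are counted as "low" (an arbitrary choice for this sketch).
    """
    mx, my = median(x), median(y)
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        high_x, high_y = xi > mx, yi > my
        if high_x and high_y:
            a += 1
        elif high_x:
            b += 1
        elif high_y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric coefficient."""
    if b == 0 or c == 0:
        return 1.0   # no disagreeing cases: treated as a perfect association
    if a == 0 or d == 0:
        return -1.0  # no agreeing cases
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Hypothetical data: number of personnel and sessions per year on taxes.
personnel = [50, 120, 200, 250, 400, 900, 5000, 25000]
tax_sessions = [1, 2, 1, 3, 2, 4, 3, 5]

cells = fourfold_counts(personnel, tax_sessions)
r_tet = tetrachoric_cos_pi(*cells)
print(cells, round(r_tet, 2))
```

With only 32 boards the fourfold cells are small, so coefficients obtained in this way are crude estimates at best.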




Results 


Activity profile. As indicated in Figure 1, 
the typical corporate board in this study 
gives relatively frequent attention through 
board meetings to: future business prospects 
(4.3 sessions per year); competition (3.9
sessions); quantity of output (3.8); dis- 
tribution (3.0); and, the business cycle 
(2.8). Relatively infrequent topics of board 
attention include: voting bonuses (1.1 sessions
per year); obtaining capital (1.3); relations
with government (1.4); company morale
(1.4); advertising (1.4); salesmanship (1.5); 
evaluation of key personnel (1.5); public re- 
lations (1.5); and salaries and wages (1.7). 
Intermediate amounts of attention are given 
to: labor relations (2.5); taxes (2.5); pric- 
ing (2.2); stock inventory (2.0); quality of 
output (2.0); relations with stockholders 
(2.0); and distribution of profits (2.0). 

Related hypotheses. The seven hypotheses 
of meaningful relation to board behavior as 
previously stated find some confirmation in 
Table 1. Metropolitan location of a firm is 
associated with certain board behaviors, particularly treatment of
relations with stockholders, voting bonuses, distribution of profits,
morale, and quantity of output.

[Fig. 1. Median number of board meetings per year [...] according to
thirty-two [corporate boards]. Axis: Number of Board Meetings.]

Size of the organization in number of per- 
sonnel is also associated with board behavior. 
Boards of larger firms give greater attention 
to future business prospects, taxes, output, 
distribution, pricing, evaluation of key per- 
sonnel, stock inventory, and distribution of 
profits. 

Size of the board itself is significantly related with board
emphasis on such topics as the business cycle, distribution of
profits, and advertising.

Extent of board personnel overlap with operating management
personnel is associated with assignment of little attention to
advertising and pricing.

The heavier the industry, the less attention does the board tend to
devote to voting bonuses and to quantity of output. An interesting
tendency also exists for the heavy industry boards to give more
frequent attention to labor relations.

Mean age of board members per board is unrelated to the board
behaviors investigated in this study.

The extent to which the board is composed of members with
memberships on other boards is associated with greater board
emphasis upon distribution, quantity of output, quality of output,
relations with stockholders, distribution of profits, advertising,
the business cycle, taxes, competition, and company morale.


Conclusions 


Insofar as these data are valid estimates of activity emphases of
corporate boards, the following conclusions may be warranted:

1. The frequent topics of board attention are future business
prospects, competition, quantity of output, distribution, and the
business cycle.

2. Moderate board attention is given to labor relations, taxes,
pricing, inventory, quality, stockholder relations, and distribution
of profits.

3. Relatively infrequent attention is assigned to salaries and
wages, public relations, evaluation of key personnel, salesmanship,
Table 1

Tetrachoric Coefficients* of Correlation Between Corporate Board
Emphases on Certain Topics and Seven Referent Variables

Columns: 1. Metropolitan; 2. No. of Personnel; 3. No. on Board;
4. Board Overlap with Mgt.; 5. Heavy Industry; 6. Mean Age of Board
Members; 7. Responsibilities to Other Boards

Topic   1.   2.   3.   4.   5.   6.   7.
1. Pricing 31 So a= a -Ñ 00 38 
2. Stock inventory 32 53 —.20 04 08 —.10 16 
3. Output Quality) 30 40 .00 —.24 —.40 — 10 61 
4. Output Quantity) WS I = 35 —.60 = 21 zi 
5. Distribution 34 60 10 —.43 —.20 31 76 
6. Salesmanship =f 40 18 — 43 01 .19 39 
7. Advertising —.08 23 47 —.82 —.29 .21 61 
8. Competition o5 4 io =.25 vs 08 50 
P Future business prospects .09 -10 40 — 34 02 — 01 05 
- Business cycle 29 46 60 —.40 —.05 07 55 
w Labor relations Al 4 10 43 43 —.22 —.09 
e Public relations AL 34 —.09 —.16 —.10 10 39 
14, Relations with government 16 39 09 a B P 25 
is. elations with stockholders 68 40 18 = É : say .60 
D axes 16 63 .09 —16 ‘a —.10 51 
TA Evaluation of key personnel 08 38 au | A = Z > 
- Salaries and wages 23 .28 —.01 : ae : 30 
18, mi 29 = 28 —.34 8 33 49 
19. opp Day morale ~ S 3L  =16 25 i7 24 
. Obtain; _07 42 4 i 225 à f 
20, pp ning capital o 5 (er ee 08 ‘60 
2 'stribution of profits Jt v6 6 07 69 23 “a 
oting bonuses 07 25 a ` ` i E 


* All coefficients for which the probability of non-chance meaning
is .95 or better are indicated in italics.


advertising, morale, government relations, obtaining capital, and
voting bonuses.

4. In general the topics given most frequent attention by boards are
those related to immediate corporate survival, while those less
frequently treated topics tend to be related either to the internal
workings of the organization, or to special staff or usually
delegated functions.

5. Such mentally stimulative factors as metropolitan environment,
large number of personnel, large board, and particularly many board
members who serve simultaneously on other boards are associated with
more frequent consideration of practically all of the topics.

6. Board overlap with operating management tends to be inversely
related with frequency of consideration of the various topics. The
only notable exception (significant at non-chance probability of
.90) to this generalization is frequency of attention to labor
relations, which is considered more frequently 
in the “overlap” boards. This latter tendency 
may be due in part to defensive attitudes 
(defensive of management) of board mem- 
bers who also are a part of operating man- 
agement. Insofar as this restriction of prob- 
lem consideration is a result of board-man- 
agement overlap, it may be a psycho-economic 
argument against allowing board members 
also to serve in operating management. 
These data do suggest that such overlap, 
when excessive, may interfere with the problem-raising
and problem-solving processes in
corporate enterprise.

7. Average member “responsibilities on 
other boards” is a variable which probably 
connotes experience and exceptional ability, 
It seems significant that boards so favored 




place notably greater board meeting emphasis 
on competition and quality of output. It 
also seems of importance that none of the 
other six “hypothesis” variables correlates 
significantly with board emphasis on either 
competition or quality of output. 

8. Mean age of board members was not a significant predictor of
topics emphasized at board meetings.


Received June 20, 1952.


References 


1. Flanagan, J. C., et al. Critical requirements for
research personnel: a study of observed behaviors
of personnel in research laboratories.
Pittsburgh: American Institute for Research,
March, 1949.

2. Gordon, Thomas. The development of a method
of evaluating flying skill based on an analysis
of the critical requirements of the airline pilot’s
job. Unpublished Ph.D. dissertation, University
of Chicago, June, 1949.

3. Wagner, R. F. Critical requirements for dentists. 
J. appl. Psychol., 1950, 34, 190-192. 




The Curve of Output as a Criterion of Boredom * 


Patricia Cain Smith 


Cornell University 


* This paper is a portion of a dissertation presented in partial
fulfillment of the requirements for the degree of doctor of
philosophy at Cornell University. The writer is indebted to [...]
for guidance, and to the management, union officials, and workers
whose active cooperation made the study possible.

The purpose of this study was to investigate the relationship
between the experience of boredom and changes in rate of output or
shape of production curves for industrial workers. The classic
investigations of the British Industrial Fatigue Research Board
(5, 6, 7, 8, 9) have satisfied the writers of our textbooks that the
experience of monotony or boredom is characteristically accompanied
by changes in the rate of output, and even that the nature of the
worker’s experience may be identified by examination of the shape of
the curve of output. A re-examination of the work of the British
investigators was made necessary by certain deviations from normally
acceptable methods of scientific investigation, which will be
discussed later in this paper.

As early as 1941, Roethlisberger and Dickson failed to duplicate the
English results. They stated: “With respect to the monotony
hypothesis, no definite conclusion could be drawn. A curve
resembling what is claimed to be a typical monotony curve was not
encountered except in the case of Operator 1A. It is clearly
understood, however, that monotony in work is primarily a state of
mind and cannot be assessed on the basis of output alone” (2, p. 127).

In 1946, Rothe undertook an investigation of the characteristics of
production data, recognizing their importance as criteria in a
variety of industrial investigations. He found that “individual
daily work curves may be of many different forms and do not [...]
any characteristic, predictable pattern” (3, p. 209). Correlations
of work curves for the same operators for different days varied
widely, the median correlation being approximately .05. Rothe
averaged work curves for each worker for one week, and obtained
trend lines which were classified by inspection. Four of these
curves were “mixed curves,” two were “fatigue curves” and two were
“monotony curves.”

Rothe was interested in determining 
whether knowledge of the production curve 
for any individual or group for a specific 
work period would permit prediction of the 
characteristics of future work curves. Neither 
he nor Roethlisberger and Dickson attempted 
to relate the shape of the work curves which 
they obtained to the experience of the in- 
dividual worker. Rothe’s study, moreover, 
was performed using hourly-paid workers 
whose work flowed in a continuous and un- 
interrupted manner, so that his results could 
not be directly applied to the very different 
incentive conditions obtaining for piece-rate 
workers whose work is grouped into lots or 
bundles. 

Since the existence of any convenient overt 
indicator of the psychological state of the 
worker would be of obvious practical im- 
portance, and would be highly useful for re- 
search purposes as well, the present investiga- 
tion of the relationship between reported 
boredom and changes in the curve of output 
was undertaken. Also included in this in- 
vestigation were such other proposed be- 
havioral indices of boredom as talking, varia- 
bility of production, and frequency of volun- 
tary rest pauses. 

This study was conducted in a small knit- 
wear mill in northern Pennsylvania. Most 
operators in the mill, and all operators studied 
in detail here, were paid by piece rate. Two 
operations were chosen for observation. Both 
were: (1) short enough so that variations in 
production would show up in the output 
curves; (2) long enough to permit timing of 
several operators at once; (3) performed in 
a uniform manner by several experienced 
operators; and (4) largely manual, so that 
the operator rather than the machine de- 


[Fig. 1. Output curve for one week, Operator 7A: Hemming Operation.
Panels: a. Monday; b. Tuesday; c. Wednesday; d. Thursday; e. Friday.
Vertical axis: garments sewed per minute. Horizontal axis: time,
7:00 to 4:00.]

[...] either work or recreational social groupings of the workers.
None of these relationships was significantly different from chance
when tested by the chi-square test. Neither in the work nor in the
recreational groups was there any evidence that the work curves of
members resembled each other for the same period of work.


It seemed likely that another operation would yield more traditional
results. One was chosen, therefore, in which there were two portions
to the task, which could be observed and timed separately. The job
was called taping. Two short stiffened pieces of cotton tape were
sewed on the unhemmed bottom of the shirt. After the operator
finished sewing the bundle, she cut the threads and folded the
shirts. Again the operators were timed continuously for one week,
and again no characteristic curves were found for cutting, for
sewing, or for the two combined. Similarly, there seemed to be no
relationship between any daily work curve and the reported feelings
of the operator. Again, the only operator who produced either
ascending or U-shaped curves reported that, despite the numerous
disadvantages of her work and the company, she was certainly not
bored. Thus the production curve criteria proved not only
unreliable, in that observers [...] classification of curves, but
invalid as a classification [...].

One of the major difficulties in the use of production curves as
criteria lies in the analysis of the data. There are no satisfactory
statistical measures available for the comparison of the shapes of
curves. Interpretation of the results of correlational analyses is
sometimes made difficult by the peculiarities of the distributions
involved. Moreover, the correlation coefficient cannot allow for
over-all similarities in the shapes of the curves, when the changes
of slope are displaced slightly in time. The alternative method of
visual inspection is subjective and, apparently, unreliable. Both
methods show little agreement from day to day, so that even though
it could be demonstrated that daily curves reflect the experience of
the individual workers, it would be highly unlikely that any
long-term relationships could be [...]. Their use in this kind of
situation seemed to be impractical.

Other changes in behavior which have been related to boredom include
frequency of talking, frequency of rest pauses, variability in rate
of working, and average speed of working. It was possible to rank
the workers within each group on each of these factors


and on intensity of boredom symptoms, esti- 
mated from both questionnaire scores and in- 
terview responses. The rankings were com- 
pared and the relationships tested for sig- 
nificance by Kendall’s non-parametric tau 
test (1, 403-408). No significant or even 
consistent relationship appeared between the 
boredom symptoms and the proposed indices. 
Reliability of the behavioral indices was esti- 
mated by comparing total rankings for each 
worker on Monday, Tuesday and Wednesday 
with the totals for Thursday and Friday. 
All of these relationships proved significant 
at the 5 per cent level or better by Kendall’s 
tau test. Individual differences were, there- 
fore, reasonably stable throughout the week. 
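
The rank comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the rankings for a group of eight workers are hypothetical, there are no ties, and the function `kendall_tau` computes the coefficient alone, not the significance test applied in the study.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau for two untied rankings of the same workers."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        product = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if product > 0:
            concordant += 1   # the pair is ordered the same way in both rankings
        elif product < 0:
            discordant += 1   # the pair is ordered in opposite ways
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ranks of eight workers on a behavioral index:
# Monday through Wednesday totals against Thursday and Friday totals.
early_week = [1, 2, 3, 4, 5, 6, 7, 8]
late_week = [3, 1, 2, 5, 4, 8, 6, 7]

tau = kendall_tau(early_week, late_week)
print(round(tau, 2))
```

A coefficient near unity would indicate that workers keep much the same relative standing from the first part of the week to the second. The same function could be applied to a ranking on boredom symptoms and a ranking on any one of the proposed indices.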


Discussion 


Why were these results so different from 
those of the British Industrial Research 
Board? In the first place, comments of the 
workers showed that each had her own con- 
cept of the number of bundles that she should 
complete in a day. If she was behind 
schedule, she hurried toward the end of the 
day; if she was ahead, she slackened speed 
or stopped entirely. One operator, who had 
just completed all but one of her customary 
bundles for the day, commented, “You’ve 
seen how fast we can do them. Now do you 
want to see how slow?” Production figures 
reflected quite clearly what the workers con- 
sidered to be the proper pace for them at 
that particular time, but not at all neces- 
sarily the way they felt about their work. 

It has been the observation of the writer 
that such pacing of work occurs with much 
greater frequency in industrial situations than 
does spontaneous variation in rate. Even 
when there is no restriction due to fear of 
rate-cutting, it is normal for any worker to 
decide in advance how much he will produce, 
and earn, each day. Effort is unquestionably 
pegged, at least within narrow ranges, in
most industrial situations. 

A careful re-examination of the English 
studies suggests several differences in method 
which perhaps further account for the dis- 
crepancy between our results and theirs. The 
most serious has already been mentioned; 
they included in their criterion items which 




were related to changes of rate of working, 
and weighted these items in the direction 
favorable to their hypothesis. The reader is 
not told, moreover, whether or not their 
curves were classified without knowledge of 
the accompanying verbal reports. Several 
other factors apparently operated to make 
the shape of their curves more consistent 
from day to day. Although they do not 
specify the kinds of jobs involved, one would 
infer from comparison of the various re- 
ports that at least six different operations 
were involved, with various hours of work 
and methods of payment. Such variations in 
jobs and conditions would tend to mask in- 
dividual variability. 

One last factor should be noted. There is 
no indication in any of their data of volun- 
tary rest pauses, even for rest-room visits. 
If decreases in production due to such work 
stoppages were averaged into their curves, 
this procedure would account for the con- 
sistency of the curves from day to day, as 
well as for the preponderance of U-shaped 
curves, since rest-room and water fountain 
visits tend to be made at about the same 


time every day, and mostly in the middle of 
the work period. 


Summary 


Continuous observation of two groups of eight women each, operating
power sewing machines on light, uniform and repetitious work, led to
the following conclusions:

1. There were fairly stable individual differences in speed of
working, variability of production, frequency of rest pauses, and
frequency of talking.

2. These differences showed no consistent relationship to the
reports of the workers concerning their feelings of boredom or
monotony.

3. No shape of work curve was found which would characterize the
individual worker.

4. Work curves for individuals forming social groups showed no
observable relationship with each other.

5. The approach of [...] hour had a noticeable effect on the
production of many of the workers. The direction of the change in
rate which appeared at the end of the day was determined by the
concept of a day's work held by the worker.

6. Boredom is not necessarily accompanied by a depression in the
curve of output, nor is a sag necessarily accompanied by feelings of
boredom.

7. Output curves should be viewed with caution as indications of the
subjective feelings of the worker.

There can be little quarrel with the claim of the British
investigators that, other factors being equal, workers tend to slow
down, talk, become restless and variable in [...] production when
bored. In most industrial situations, however, one cannot assume
that all other factors are equal, and many of [...] factors may
heavily outweigh the influence of interest or boredom in producing
changes in working behavior.


Received May 28, 1952.


Received May 28, 1952. 


References

1. Kendall, M. G. The advanced theory of statistics.
London: Charles Griffin and Company, [...].
Vol. 1.

2. Roethlisberger, F. J., and Dickson, W. J. Management
and the worker. Cambridge: Harvard
University Press, 1941.

3. Rothe, H. F. Output rates among butter wrappers:
I. Work curves and their stability. J.
appl. Psychol., 1946, 30, 199-211.

4. Rothe, H. F. Output rates among butter wrappers:
II. Frequency distributions and an hypothesis
regarding the “restriction of output.”
J. appl. Psychol., 1946, 30, 320-321.

5. Vernon, H. M., Wyatt, S., and Ogden, [...]. On
the extent and effects of variety in repetitive
work. Industrial Fatigue Research Board Report.
No. 26, 1924.

6. Wyatt, S., assisted by Fraser, J. A. Studies in
repetitive work with special reference to rest
pauses. Industrial Fatigue Research Board Report.
No. 32, 1925.

7. [...] and uniformity in work. Industrial Fatigue
Research Board Report. No. 52, [...].

8. Wyatt, S., and Fraser, J. A., assisted by Stock,
F. G. L. The effects of monotony in work.
Industrial Fatigue Research Board Report.
No. 56, 1929.

9. Wyatt, S., and Langdon, J. N., assisted by Stock,
F. G. L. Fatigue and boredom in repetitive
work. Industrial Health Research Board Report.
No. 77, 1937.




Predicting Success in Elementary Accounting 


O. R. Hendrix 


Office of Student Personnel and Guidance, University of Wyoming 


A recent article by Traxler 1 regarding the use of objective tests
for the selection of personnel in the professional field of
accounting, encourages further investigation of the suitability of a
number of tools for the prediction of success in college courses in
accounting. This study represents a preliminary investigation of the
relative validity of Form C of the American Institute of Accountants
Orientation Test (AIA), the 1947 Edition of the American Council on
Education Psychological Examination (ACE), Form 23 of the Ohio State
University Psychological Test (OSU), and the accountant scale of the
Strong Vocational Interest Blank for Men (SVIB) for predicting
success in elementary accounting at the University of Wyoming.

1 Traxler, A. E. Objective testing in the field of accounting.
Educ. psychol. Measmt., 1951, 11, 427-439.

In the fall of 1949, the AIA Orientation Test (Form C) and the
Strong Vocational Interest Blank for Men were administered to 110
freshmen students enrolling in elementary accounting in the College
of Commerce and Industry at the University of Wyoming. Scores on the
other two tests mentioned above were already available for most of
these students and it was possible to secure the four test scores
and accounting grades for 95 students out of the 110. Of this number
76 were men and 19 were women.

Statistical constants for the four tests and the criterion are
presented in Table 1.

Intercorrelations for the five variables involved were computed and
are contained in Table 2. The highest coefficient of correlation,
.84, is that between OSU and ACE. This might be expected since both
are tests of general ability to do college work. A substantial
relationship is also noticed between these two tests and AIA
Orientation Test. No significant relationship seems to exist between
SVIB and the other three tests although the


coefficient of correlation of .18 between SVIB 
and AIA Orientation Test is of interest. The 
standard error of .18 was .099, indicating 
significance between the .05 and .10 levels, 
The most interesting revelation is that both 
ACE and OSU seem to be more closely re- 
lated to grades in elementary accounting than 
is AIA Orientation Test. This is especially 
interesting in view of the fact that AIA Orien- 
tation Test is intended to be “a general in- 
telligence test slanted toward business.” 2 
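
The standard error quoted above can be checked with the classical large-sample formula for the standard error of a correlation coefficient, (1 - r^2) / sqrt(N). The article does not say which formula was used, so this is an assumption; it does reproduce the reported figure. The function name is introduced only for this sketch.

```python
import math


def standard_error_of_r(r, n):
    """Classical large-sample standard error of a correlation coefficient."""
    return (1.0 - r * r) / math.sqrt(n)


r_svib_aia = 0.18   # correlation between SVIB and AIA reported in the text
n_students = 95     # number of students with complete records

se = standard_error_of_r(r_svib_aia, n_students)
critical_ratio = r_svib_aia / se

print(round(se, 3))              # 0.099
print(round(critical_ratio, 2))  # 1.81
```

A ratio of about 1.81 lies between the two-tailed normal deviates of 1.645 (the .10 level) and 1.96 (the .05 level), which agrees with the statement that the coefficient is significant between the .05 and .10 levels.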

Multiple correlations were computed be- 
tween all possible pairings of the four tests 
and accounting grades. Table 3 lists the cor- 
relations obtained. If the prediction tools 
are to be limited to two out of the four tests 
considered here, it would seem that the best 
two combinations would be ACE and SVIB, 
or OSU and SVIB. Again the interesting 
revelation is that AIA Orientation Test is 
not to be found in either of the best two 
combinations of two out of four tests. 
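
Why an interest key may pair well with an ability test, while two ability tests add little to each other, follows from the formula for the multiple correlation of a criterion with two predictors: R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2). The sketch below is illustrative only. The predictor intercorrelations of .00 and .84 echo those given in Table 2 for ACE with SVIB and for ACE with OSU, but the validity coefficients are hypothetical, and the function name is not from the article.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical validities; intercorrelations as read from Table 2.
ability_plus_interest = multiple_r_two_predictors(0.40, 0.30, 0.00)
two_ability_tests = multiple_r_two_predictors(0.40, 0.38, 0.84)

print(round(ability_plus_interest, 2), round(two_ability_tests, 2))
```

Under these assumed values an uncorrelated second predictor raises the multiple correlation well above the better single validity, while a second predictor that correlates .84 with the first raises it hardly at all.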

Multiple correlations were also computed 
between all combinations of three out of four 
tests and accounting grades. These correlations
are recorded in Table 4.

The best combination of three tests for predicting success in
elementary accounting is apparently ACE, OSU and SVIB (Accounting
Key). Again it is interesting to note that this is the one possible
combination of three out of the four tests that does not include the
AIA Orient. Test. Addition of AIA Orient. Test to the cluster of
three tests did not appreciably increase the predictive value of the
cluster. (Both R’s were .55 when rounded to two decimals.
Theoretically the introduction of an additional variable into a
cluster will always increase R, but the increment in this instance
was so small that it is not observable when the R’s were rounded.)


Table 1

Means and Standard Deviations of Test Scores and Fall
Quarter Grades in Elementary Accounting*

Variable                              Mean     S.D.

ACE Psychol. Exam.                   110.5     18.3
OSU Psychol. Test                     67.9     24.3
AIA Orientation Test                  33.9     11.0
Strong Interest Blank, Acctg. Key     36.9     11.0
Elem. Accounting Gr.†                  2.9      1.0

* N = 95.
† Grades given at the University of Wyoming are as
follows: I (A), II (B), III (C), IV (D), and V (Failure).

2 Ibid., p. 428.


Table 2

Intercorrelations of Test Scores and Grades in Elementary Accounting for the Fall of 1949*

                                   OSU           AIA           Strong        Elem.
                                   Psychol.      Orient.       Interest      Accounting
                                   Test          Test          Blank         Grades†

ACE Psychol. Exam.                 .84           .66           .00           [illegible]
OSU Psychol. Test                                .52           [illegible]   [illegible]
AIA Orient. Test                                               .18           [illegible]
Strong Interest Blank, Acctg. Key                                            [illegible]

* N = 95.
† Due to the grading scheme employed, e.g., A = 1, B = 2, etc., these
coefficients of correlation as computed were all negative, but are
listed here as positive since the true sense of the relationship is
positive.

The findings are all based upon the as- 


Table 3 


Coefficients of Multiple Correlation Between Various 
Pairings of Test Scores and Grades in 
Elementary Accounting in the 
Fall of 1949* 


Pairs of Test Scores R 


ACE Psychol. Exam. and 
Strong Interest Blank, Acctg. Key 


St 
OSU Psychol. Test and 
Strong Interest Blank, Acctg. Key 48 
ACE Psychol. Exam. and 
AIA Orient. Test Al 
OSU Psychol. Test and 
ACE Psychol. Exam. 39 
OSU Psychol. Test and 
AIA Orient. Test 39 
AIA Orient. Test and 
Strong Interest Blank, Acctg. Key 39 

SS SP iP 


N=95. 


sumption that accounting grades are an acceptable
criterion for judging the relative
validity of the tests under consideration.
While grades are known to be not as reliable
as is desired, they are the criterion of performance
most generally used in college
courses.

It is obvious that this study is restricted
to the relationship between the test scores
considered and grades. It does not necessarily
follow that the same relationship exists
between the test scores and success in
employment in the field of accounting.
For example, it is possible that many college
professors weigh mastery of the theoretical
aspects of accounting more heavily than
practical skills in accounting when awarding
grades. On the other hand, success in employment
in accounting may be more closely
related to the practical skills. These, of
course, are hypothetical assumptions, but
do illustrate the danger of drawing conclusions
from this study concerning the rela-


Table 4 


Coefficients of Multiple Correlation Between All
Possible Combinations of Three and Four
out of Four Tests and Grades in
Elementary Accounting

Combinations of Test Scores                    R

ACE, OSU, and SVIB                             .55
ACE, OSU, SVIB, and AIA Orient. Test           .55
ACE, SVIB, and AIA Orient. Test                [illegible]
OSU, SVIB, and AIA Orient. Test                [illegible]
ACE, OSU, and AIA Orient. Test                 [illegible]




tionship between AIA Orientation Test and
success in employment in accounting.


Summary 


1. If a single test is to be utilized in pre-
dicting grades in elementary accounting, ACE
Psychol. Exam. and OSU Psychol. Test are
preferable to the AIA Orientation Test.

2. If two tests are to be used, neither of 
the two best combinations of two out of four 
tests includes the AIA Orientation Test. 


3. If three out of the four tests are to be 
used, the best combination of three does not 
include the AIA Orientation Test. The ad- 
dition of AIA Orientation Test to the cluster 
of three does not improve the predictive value 
of the cluster. 

4. It does not necessarily follow that the 
same relationship would be obtained if the 
criterion used were success in professional em- 
ployment as an accountant. 


Received May 28, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 2, 1953 


An Index of Selective Efficiency (S) for Evaluating a Selection Plan


William Leroy Jenkins 
Lehigh University 


Suppose a selection plan has been validated 
and the multiple R turns out to be about .60. 
What does a validity coefficient of this size 
indicate about the selective efficiency of the 
plan? 

The index of predictive efficiency (E) is
not a satisfactory measure. For an R of .60:


$$E = 100\left(1 - \sqrt{1 - R^2}\right) = 20\%$$


which represents the per cent improvement 
over chance in predicting individual criterion 
scores. But ordinarily a selection plan is 
designed merely to pick a group of successful 
workers and to eliminate a group of unsuc- 
cessful workers—not to predict the criterion 
score of each individual. 
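
As a check on the arithmetic, the index of predictive efficiency can be computed directly. This is a minimal sketch; the function name `predictive_efficiency` is ours and does not come from the article.

```python
import math

def predictive_efficiency(r):
    """Index of predictive efficiency E, in per cent, for a validity coefficient r."""
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))

print(round(predictive_efficiency(0.60), 1))  # 20.0
```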

What we need is an index of selective ef- 
ficiency that will indicate how well we can 
pick such groups. Particularly we are inter- 
ested in accepting as many as possible of the 
potentially superior workers and rejecting as 
many as possible of the potentially inferior 
workers. Let us call the highest quarter on 
the job criterion “superior workers,” and the 
lowest quarter on the job criterion “inferior 
workers.” The middle half will be “mediocre 
workers.” Then: 


Successes of the plan are superior workers 
accepted and inferior workers rejected. 


Failures of the plan are superior workers re- 
jected and inferior workers accepted. 


Suppose we have two hundred applicants
and choose half of them with the aid of a
brown Stetson hat. If we obtain job criterion
scores for all of them, we can expect
to find something like this:

             Inferior   Mediocre   Superior
Accepted        25         50         25
Rejected        25         50         25

Successes: 25 + 25 = 50
Failures: 25 + 25 = 50




Selection on a chance basis leads in the 
long run to an equal number of successes 
and failures. 

But suppose we use a selection plan hav- 
ing a validity coefficient of about .60. If we 
obtain job criterion scores for all two hun- 
dred men, we should find something like this: 


             Inferior   Mediocre   Superior
Accepted        10         50         40
Rejected        40         50         10

Successes: 40 + 40 = 80
Failures: 10 + 10 = 20

Improvement over chance: (80 - 20) / (80 + 20) = 60%
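
The two tables above can be reduced to a single computation of successes minus failures over their sum. The sketch below uses our own function and argument names, which are illustrative and not part of the original article.

```python
def selective_efficiency(superior_accepted, inferior_rejected,
                         superior_rejected, inferior_accepted):
    """Per cent improvement over chance: (successes - failures) / (successes + failures)."""
    successes = superior_accepted + inferior_rejected
    failures = superior_rejected + inferior_accepted
    return 100.0 * (successes - failures) / (successes + failures)

chance = selective_efficiency(25, 25, 25, 25)  # selection with the Stetson hat
plan = selective_efficiency(40, 40, 10, 10)    # plan with validity of about .60
print(chance, plan)  # 0.0 60.0
```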


With any actual sample of 200 applicants,
the figures might not come out in this exact
symmetrical pattern but the per cent improvement
over chance should be substantially
the same.

With the aid of a chart for computing
tetrachoric r,² it is possible to determine the
theoretical improvement over chance corresponding
to any obtained value of R for
any proportion of total applicants accepted.
Some typical values of the index of selective
efficiency (S) are shown below:

Proportion
Accepted       R = .50    R = .60    R = .70    R = .80
One-third        48%        57%        66%      [illegible]
One-half         52%        63%        74%      [illegible]
Two-thirds       48%        57%        66%      [illegible]


For all practical purposes we may regard
the index of selective efficiency (S) as having the
same numerical value as the validity coefficient,
if we are accepting something between
one-third and two-thirds of the applicants.

In our experience, the index of selective
efficiency (S) has proved a useful way of
explaining the meaning of a validity coefficient
to someone who is unfamiliar with
statistics.
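
The tabled values were read from a tetrachoric chart. As a rough cross-check, the same quantity can be approximated numerically under an assumed bivariate normal model for predictor and criterion. The sketch below swaps in Simpson's-rule integration for the chart; all function names are ours, and the results should only be expected to land near the tabled figures, not to reproduce the author's procedure.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(p):
    """Inverse normal CDF by bisection, for 0 < p < 1."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upper_quadrant(c, q, rho, steps=4000):
    """P(X > c and Y > q) for a standard bivariate normal with |rho| < 1.

    Integrates P(Y > q | X = x) * pdf(x) from c to c + 12 by Simpson's rule.
    """
    scale = math.sqrt(1.0 - rho * rho)

    def f(x):
        return norm_pdf(x) * (1.0 - norm_cdf((q - rho * x) / scale))

    a, b = c, c + 12.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def theoretical_s(rho, proportion_accepted):
    """Theoretical S in per cent; top and bottom criterion quarters are superior and inferior."""
    c = norm_ppf(1.0 - proportion_accepted)  # cut-off on the predictor
    q = norm_ppf(0.75)                       # cut-off for the top criterion quarter
    superior_accepted = upper_quadrant(c, q, rho)
    inferior_rejected = upper_quadrant(-c, q, rho)  # by symmetry of the distribution
    successes = superior_accepted + inferior_rejected
    failures = 0.5 - successes  # superior plus inferior workers are half of the group
    return 100.0 * (successes - failures) / 0.5

for r in (0.50, 0.60, 0.70):
    print(r, round(theoretical_s(r, 0.5), 1))
```

Because the model is symmetric, accepting one-third and accepting two-thirds give the same value of S, which matches the pattern of the table.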


Received June 16, 1952. 


² Jenkins, W. L. A single chart for tetrachoric r.
Educ. psychol. Measmt., 1950, 10, 142-144.




A Note on Techniques in the Investigation of Accident Prone Behavior * 


Lawrence L. LeShan 
Roosevelt College 


In the past several years there have appeared,
in the psychological literature, a large
number of studies of accident proneness.
Many of the articles which have appeared
have lost some of their potential value due
to a lack of clarity concerning the special
problems of technique which exist in this
field. It is the purpose of this paper to point
up a few of these problems.¹


Method 


The usual method of finding a population
of accident prones includes either an interview
technique or a survey of accident records
of an industrial organization, a police file, insurance
records or some source of this sort.
Each of these has dangers attached to it.

The interview. Accident prones have a
strong tendency to “forget” accidents. A few
examples may serve to illustrate this.
One man revealed half a dozen major accidents.
Intensive interview probing found no
others. At the end of the interview, he was
asked to strip, and his body was examined
for scars. A previously undisclosed scar on
the right side of his chest was called to his
attention. He then remembered that three
years previously a bulldozer had rolled over
him, injuring his back and breaking three
ribs.


Another man was leaving the interview


* The authors accept full responsibility for this
paper. It is a pleasure, however, to thank [...],
Research Director of the National [...], for
raising and clarifying many points.

¹ The present authors became interested in this
problem as a result of being involved in research
concerning the psychodynamics of individuals [...].
(The results [...] in Psychiatry: Journal for the
Study of Interpersonal Processes, Vol. 15, No. [...],
73-80.) As part of this research [...] were
interviewed by one of [...]. The other part of the
study [...] analyses of projective tests of sixty-five
accident prones and seventy-five equated
non-accident-prones. Approximately twenty intensive
interviews (of accident prones) were completed by
the [...] of this paper.


and 




Jim B. Brame 
University of Houston 


room when the examiner (JBB) noticed he 
had a bent distal phalanx of the right little 
finger. On inquiry the patient said he “just 
remembered” that he broke that finger the 
previous year. 

A twenty-one year old male with several 
accidents denied any further accidents dur- 
ing thirty minutes of detailed questioning. 
Towards the end of this period he started 
rubbing his right elbow. On specific ques- 
tioning, he recalled that he had broken his 
arm when he was eighteen. 

Behavior of this sort is by no means in-
frequent. In the experience of the authors,
it is the general rule rather than the excep-
tion.²

For this reason, a great deal of skepticism 
must be attached to results gained by the 
written questionnaire method also. An ex- 
perience of one of the writers (LLL) illus- 
trates this point. A questionnaire was filled 
out by 40 accident repeaters who had been 
called into a state driving clinic. They had 
all had at least three auto accidents in 12 
months. Over half of them did not remem- 
ber all three accidents. 

When an interview technique is being used 
to obtain an accident history, the subject 
should be questioned on a year-by-year basis. 
This would include the jobs worked at and 
the particular hazards of each job; vehicles 
driven, repairs and their cost; sports par- 
ticipated in, falls and bruises. Special refer- 
ence should be made to burns and scalds 
since these are not often thought of as 
“accidents” by the subject. 

The interview frequently makes people de-
fensive about their accident record as they
² No quantitative estimate [...]
can be made, since we do not know how many accidents
were not recalled at all in our interviews.
However, in thirty-five interviews, at least thirty of
the subjects recalled several more accidents after
careful probing than they had when simply asked to
list all the accidents that they had had and then
given plenty of time and a sympathetic [...].




may see implications of punitive intent. 
They may, therefore (in addition to the acci- 
dents that they have repressed), deliberately 
not state others which they do consciously re- 
member. For this reason, careful attention 
must be paid to the psychological atmosphere 
of the interview. A good relationship is es- 
sential to accurate data collation. We feel 
that, by and large, an authoritarian relation- 
ship tends to produce markedly less data than 
an egalitarian one.³ A procedure that is often
helpful is to express interest in the general 
health history and to record all illnesses. 
One point about the interview which should 
be considered in research design is that it is 
essential to gather data on control groups in 
the same manner it is gathered on groups of 
accident repeaters. An intensive probing in- 
terview covering the entire life-span of the 
individual produces a surprisingly large num- 
ber of accidents in the general run of the 
population. Since a definition of accident
proneness implies that the individual con-
cerned has a higher accident rate than his
peers, both experimental and control groups 
have to be evaluated with the same technique. 
The use of accident records. Probably the 
most common technique for studying acci- 
dent-prones is to use the data of the safety 
departments of police or insurance firms, in- 
dustrial firms, etc. There are several dangers 
inherent in this method, two of which might 
be mentioned briefly. 


Although this is probably a valid way of
collecting data on experimental groups, it is
a dubious procedure for control groups. We
do not know how many individuals are accident
prone at home and not at work. If a
man has a high off-duty accident record and
a low on-duty accident record and we study
him as a non-accident prone since we have
only the plant statistics, he is likely to confuse
our data, to say the least.

We know so little about the accident-[...]
know if he is more or [...] accidents to the plant
[...], if he tends to [...] of accidents, etc.

³ This remark is not the result of any experiment
but simply an impression based on experience
with varied types of interviews.




Defining an Accident 


This is a complex and difficult problem.
Generally we consider an accident to be a
mishap with a sudden onset. However, this
by no means solves our problems. Parenthetically,
it might be stated that the Workmen’s
Compensation Act of the State of Virginia
has a four line definition of “accident”
which is followed by seventeen single spaced
pages of clarification in fine print.

We have little clear understanding of the
difference between a disease and an accident.
If a workman habitually neglects washing his
hands after he finishes work with coal
products and develops a skin irritation that
incapacitates him, how does this differ from
typical accident-prone behavior in which the
individual injures himself by neglecting
elementary safety precautions? Should we
count this as an accident?

Even though we eliminate occupational
disease and use only traumatic injury, other
problems arise in the same area. We see the
report of a man who has 15 back-sprains.
The medical report states he has a “[...]
back.” Is each sprain to be counted as an
accident? Is there a difference between a
man who has this particular disorder and
(granted freedom of choice) repeatedly gravitates
to jobs calling for heavy lifting and a
similarly handicapped man who takes positions
which will not put such a strain on his
back?

The difference between a chargeable and
non-chargeable accident is often used in studies
but is frequently more apparent than real.
Surveys of trucking company records by one
of the writers (LLL) have shown that individuals
who have high rates of chargeable
accidents tend also to have high rates of non-chargeable
accidents. Many accidents which
appear to be non-chargeable on superficial examination
are chargeable when carefully examined.
One accident prone had had [...]
automobile accidents while he was sitting in
the front seat of a car and someone else was
driving. He had, he said, “generally been
talking to the driver when it happened.” In
one of the accidents he had hurt his elbow
badly as it had been outside the ventilator
window when the car crashed. This stimulated
the interviewer to probe at
length into exactly what had happened.
After [...] minutes a picture emerged that was
quite different from the earlier “non-chargeable”
one. It is true that he had been sitting
beside the driver, but he had decided to clean
the windshield. He thrust his hand with a
towel through the ventilator window. At 50
MPH the towel flapped over and covered
most of the windshield, the driver was blinded
and the crash occurred.

Another type of problem in defining the
accident is illustrated by an individual who
had no history of injuries or accidents (as
these are usually defined). However, investigation
showed that he had been fired
from his last position (as a pharmacist) for
“[...] mistakes.” At the time he was seen
he was working as a pilots’ mechanic. This
man had no automobile crashes or falls in his
[...]. He simply made minor errors
in [...] of such a nature that the errors could
have disastrous effects. Definitions of accident
made for a particular study should
clearly exclude or include individuals of this
type.


General Considerations 


There are no agreed-on definitions of “accident,”
“[...],” or “accident prone.” Each
study must first decide what it is attempting
to find out. In terms of various factors such
as population studied, purpose of research,
[...] available, etc., definitions can be
made.

This, perhaps, can be most clearly seen in
defining the accident-prone. There is general
agreement that he should have an accident
rate higher than that of his peers; but as
to how far above the mean of his peer group
he must be there is no agreement. Shall we
select the upper 1% of our population and
label them “accident-prones,” or shall we use
the upper 5%, the upper 25%, or the upper
50%? There is no agreement here.

The problem can be approached in another
way. Rather than examine (by implication)
the accident liability of the environment
we are studying, as was implied in
the last paragraph, we can examine the accident
liability of the individual. We can
then use criteria such as one accident per
year for at least 5 years, or 3 accidents every
2 years for 10 years, etc. In this way the
State of Oregon labels a man an “accident-repeater”
if he has had 3 accidents in any
12-month period. This only includes 4% of
state drivers, but the 4% have 40% of the
accidents. (Unlisted memo. in the files of
the National Safety Council.)
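
An individual criterion of the kind just described, 3 accidents in any 12-month period, amounts to a sliding-window check over a person's accident dates. The sketch below is illustrative only: the function name, the two drivers, and the treatment of "12 months" as 365 days are our assumptions, not details taken from the Oregon rule.

```python
from datetime import date, timedelta

def is_accident_repeater(accident_dates, min_accidents=3, window_days=365):
    """True if at least `min_accidents` accidents fall within any span of `window_days` days."""
    dates = sorted(accident_dates)
    window = timedelta(days=window_days)
    for i in range(len(dates) - min_accidents + 1):
        # dates[i] through dates[i + min_accidents - 1] are consecutive accidents
        if dates[i + min_accidents - 1] - dates[i] < window:
            return True
    return False

# Hypothetical drivers, for illustration only.
driver_a = [date(1950, 1, 10), date(1950, 6, 2), date(1950, 12, 20)]
driver_b = [date(1948, 3, 1), date(1949, 9, 15), date(1951, 2, 7)]
print(is_accident_repeater(driver_a), is_accident_repeater(driver_b))  # True False
```

Changing `min_accidents` and `window_days` gives the other criteria mentioned above, which makes it easy to see how strongly the size of the "accident-prone" group depends on the definition chosen.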

A problem here is that the total accident
record of some individuals is not consistent 
by a year-by-year, or even decade-by-decade, 
analysis. Often a person may normally be 
non-accident-prone but for a period of two 
to five years show high accident rate and 
then at the end of this time, return to his 
former low level of accidents. 

In the design of research, it may be un- 
wise to use as controls only individuals with 
low accident rates. There is no evidence 
that this is not a special group with different 
characteristics than are found in the normal 
population. Until this problem has been in- 
vestigated, controls should be taken from the 
center of the accident curve rather than from 
the lower extremity. 

Good research design will demand that ac- 
count be taken of both the accident liability 
of the specific environment and the liability 
of the individual. Fleming and Dickinson’s 
excellent paper ⁴ discusses the relationship of
personal and situational liability. They state, 
in part, “A high accident potential and an 
accident-prone driver make for a high acci- 
dent expectancy. A high accident poten- 
tial and a normal driver make for an accident 
possibility” (p. 171). No study which does 
not evaluate both the individual and the 
group accident rate can expect to produce 
clear cut results. 

Summary 


Special problems exist in the design of
studies of accident prone behavior. A few
of these are briefly discussed. Difficulties in
finding the accident-rate of an individual,
defining an accident, and delimiting accident-prone
and non-accident-prone groups are
pointed out.
Received May 26, 1952. 


⁴ J. Fleming, Jr., and J. J. Dickinson. Accident
proneness and accident law. Harv. Law Rev.,
63, 169.




The Efficiency of the Minnesota Teacher Attitude Inventory for 
Predicting Interpersonal Relations in the Classroom * 


Robert Callis 


University Counseling Bureau, University of Missouri 


Our main problem in this study is to test 
the efficiency of a measuring instrument to 
predict the ability of a teacher to effect 
harmonious interpersonal relations in the 
classroom. We believe that harmonious in- 
terpersonal relations in the classroom are de- 
sirable. We also believe that the teacher is 
a key figure in the kind of relationship that 
prevails. If good interpersonal relations are 
obtained between teacher and students, then 
it follows that the teacher and students will 
work together in a social atmosphere of co- 
operative endeavor and with a mutual feeling 
of security. Also the students will be mo- 
tivated to learn the material at hand more 
easily, and will have an opportunity to do
so in a manner which is most efficient for 
them individually. If, on the other hand, 
the social climate in the classroom is char- 
acterized by tension, fear, and submission on 
the part of the students, the student is apt 
to have little motivation to learn; and, as a 
by-product, numerous disciplinary problems, 
inattention, and restlessness will result. If
there is mutual distrust and hostility between 


the teacher and students, probably little 
learning will occur, 


We have assumed 
tudes resultin, 


creates in his 
e presumed to 
factors such as 
intelligence, gen- 


ching skills, 
* This study was a part of the Un: 


If we are able to measure these attitudes satisfactorily,
we then should be able to predict to
a significant degree the kind of relationship
which will be obtained in the classroom.
Specifically our problem is: How well will
the Minnesota Teacher Attitude Inventory
(MTAI) predict interpersonal relations in
the classroom?


Procedure 


The predictor. The MTAI was selected as
the predictor for this study since it attempts
to measure the kinds of teacher attitudes
which are relevant to teacher-student relations.
Two studies of the validity of the
MTAI have already been reported (2, 3).
In each of these it was found that the MTAI
would predict a three-fold criterion of teacher-student
relationships to the extent indicated
by a correlation coefficient of .59. The
MTAI contains 150 attitude statements to
which the teacher responds with one of five
possible responses. The scoring system was
determined by purely empirical means.
Responses to the MTAI were secured from
one group of teachers judged to be superior
in their relations with students and another
group judged to be inferior in their relations
with students. The per cent of each group
choosing the various response categories was
computed and the significance of the difference
between these percentages was determined.


A significant difference in percentage
favoring the superior group was scored
“+1”; a significant difference favoring the
inferior group was scored “-1”; non-significant
differences were scored “0.” Following
is an example:


Item: Most children are obedient.

                     Strongly                                        Strongly
                     Agree      Agree     Uncertain    Disagree     Disagree
Superior group        34%        58%         4%           3%           1%
Inferior group        18%        64%         4%          13%           1%
Differences in %      +16        -6          0           -10           0
Scoring               +1          0          0            -1           0


It could be argued, on logical grounds, that
the “disagree” and “strongly disagree” response
categories should be scored “-1.”
However, in the past logical face validity for
determining scoring systems has been found
to be such a notoriously poor predictor of
psychological functions that the authors of
the MTAI decided to use a scoring system
based on empirical data only.

The criterion. A major task was to describe
adequately the kinds of relationships
which existed in each of several classrooms.
We obtained three estimates of this relationship
from different sources. First we obtained
such an estimate from the students in
each classroom. This was obtained through
a 47-item questionnaire or inventory about
“My Teacher,” which was administered to
all students in attendance the day we contacted
the class. This inventory is the same
as the one used by Leeds (2). Such questions
as these were asked: “Do you like
school?” “Is this teacher often bossy?” “Is
this teacher usually kind to you?” The inventory
was scored “rights minus wrongs.”
The possible range in scores was +47 to
-47. Therefore, a score of zero indicates
that the student made as many negative
criticisms of the teacher as he made positive
statements about him. The zero score would
[...] be that expected for an average teacher.
The mean score on the student inventory for
each class was obtained. This mean constituted
the evaluation by the students of
the interpersonal relations in that particular
classroom.

The second evaluation of the classroom
relations was made by the principal of the
school. The principal made his evaluation in
the form of a rating scale. This is the same
rating scale which was designed and used by
Leeds (2). Items 1 through 6 of the rating
scale were scored on a 5-point scale, thus
yielding a possible range in scores of 6
through 30. When the ratings were inspected,
it was found that there were wide
discrepancies among the means of ratings 
made by the various principals. We con- 
sidered that these discrepancies could be due 
to: (a) wide variation in the leniency of the 
raters; or (b) wide differences in the quality 
of teachers in the various school buildings. 
It was necessary to assume one or the other 
in analyzing our data. We chose the former. 
Consequently, the principal ratings were ex- 
pressed as deviations from the mean of the 
particular rater. That is, all the ratings 
which each principal made were averaged and 
each teacher’s score was expressed as a devia- 
tion from that mean. In this way we, in 
effect, equated all schools on the quality being 
rated. This assumption of equality of all 
schools in classroom relations is not neces- 
sarily justified, but it was our opinion that 
less error would result with this technique 
than to assume that all raters (principals) 
were equally lenient in their ratings. It 
would have been desirable to equate the
variability of each set of ratings in addition 
to equating the means. This was not done 
because several sets of ratings contained only 
three or four cases. 
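
The correction for rater leniency described above, expressing each rating as a deviation from the mean of the principal who made it, can be sketched as follows. The function name, the principals and the ratings are all hypothetical and serve only to show the arithmetic.

```python
from collections import defaultdict
from statistics import mean

def deviation_scores(ratings):
    """ratings: list of (rater, teacher, raw_rating) tuples.

    Returns {teacher: raw_rating minus the mean of all ratings made by that rater}.
    """
    by_rater = defaultdict(list)
    for rater, _teacher, raw in ratings:
        by_rater[rater].append(raw)
    rater_means = {rater: mean(values) for rater, values in by_rater.items()}
    return {teacher: raw - rater_means[rater] for rater, teacher, raw in ratings}

# Hypothetical ratings on the 6-to-30 scale: principal P1 is lenient, P2 is severe.
ratings = [("P1", "T1", 28), ("P1", "T2", 26), ("P1", "T3", 30),
           ("P2", "T4", 18), ("P2", "T5", 20), ("P2", "T6", 22)]
scores = deviation_scores(ratings)
print(scores)
```

After the correction the two schools have the same mean (zero), which is exactly the assumption of equal school quality that the text acknowledges. The spread of each principal's ratings is left untouched, as in the study.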

The third estimate of the classroom relations
was made by two observers from our
research team. Each observer visited the
classroom at different times and observed
the class in process for thirty minutes to an
hour. Independent of each other they recorded
their observations on a rating scale.
Items 1 through 5 on the rating scale were
scored on a 5-point scale for each item, thus
yielding a possible range in scores from 5
through 25. Each observer’s ratings were
converted to standard scores (based on his
own distribution) and then the two standard
scores were averaged to arrive at the
“mean observer rating.” The “mean observer
ratings” constituted the third estimate of the
criterion of classroom relations.

Each of the three above criteria (student
ratings, principal ratings as deviation scores,
and mean observer ratings) was converted
to standard scores and summed. The sums
of the three criteria scores were converted
to standard scores and called the composite
criterion. This last step was done merely
to facilitate inspection of our data.
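
The construction of the composite criterion can be sketched in a few lines. Several details here are assumptions on our part: standard scores are taken to have mean 50 and standard deviation 10 (suggested by the means near 50 in Table 1), the population standard deviation is used, and the four classes of data are invented. The observer input is assumed to be the already averaged "mean observer rating."

```python
from statistics import mean, pstdev

def standard_scores(values, center=50.0, spread=10.0):
    """Linear conversion to standard scores with the given mean and standard deviation."""
    m, s = mean(values), pstdev(values)
    return [center + spread * (v - m) / s for v in values]

def composite_criterion(student, principal_dev, observer_mean):
    """Standardize each criterion, sum the three per class, then standardize the sums."""
    parts = [standard_scores(student),
             standard_scores(principal_dev),
             standard_scores(observer_mean)]
    sums = [sum(triple) for triple in zip(*parts)]
    return standard_scores(sums)

# Hypothetical data for four classes.
student = [30.0, 12.5, 24.0, 5.0]
principal_dev = [1.5, -2.0, 0.5, 0.0]
observer_mean = [55.0, 42.0, 51.0, 47.0]
composite = composite_criterion(student, principal_dev, observer_mean)
print([round(c, 1) for c in composite])
```

Standardizing before summing gives each of the three sources equal weight in the composite, whatever the range of its raw scale.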

The sample. The sample for this study
consisted of 77 public school classes in central
Missouri. Grades four through ten in
four school systems were represented. The
population of these cities varied from 7,500
to 26,000. There was only one teacher contacted
in these four school systems who declined
to participate in the study. There
were 82 classes in the original group from
these four cities. The sample was reduced
to 77 due to incomplete data. In these four
cities all Negro children attended a school
separate from the schools for the white children.
None of the Negro classes was included
in this study. The grades included
varied from city to city, depending upon the
organization of that particular school system.
In grades four through six, the classes
met as the usual elementary school class.
In grades seven through ten, the grades were
organized on a typical high school plan. Of
the 77 teachers there were 8 male teachers,
48 married female teachers and 21 single
female teachers.


Results 


Table 1 presents the means and standard
deviations for the predictor and the various
criteria. The means of the ratings made by
our two observers are quite similar; however,
the variance of ratings made by observer
X was significantly greater than for observer
Y (F = 2.13; n1 = n2 = 76; P < .01).

¹ This was the unpublished form of the MTAI.
The form which was published subsequent to this
study (1) varies in a few minor details only from
the one used here.



Table 1


Summary Statistics for the Predictor (MTAI)
and the Various Criteria
Note: N = 77 classes.


                                                Standard
Variable                             Mean       Deviation
Criteria:
(1) Observer X Ratings               18.7       [illegible]
(2) Observer Y Ratings               18.1       [illegible]
(3) Mean Observer Ratings            49.8†      [illegible]
(4) Student Ratings                  24.4       [illegible]
(5) Principal (deviation score)      50.4†      [illegible]
(6) Composite                        49.7†      10.[illegible]
Predictor:
MTAI                                 27.5       35.[illegible]


† Based on standard scores computed from a few
more cases than the 77 teachers in the correlation
analysis. Values other than those marked (†) are based
on raw scores.


The two mean ratings are only slightly higher
than the middle or average point on the rating
scale. This agrees with our more subjective
impression that we were dealing with a typical
or average group of teachers.

The ratings made by the principals are in
sharp contrast with those made by our observers.
The mean of ratings made by the
principals was 25.0, while the highest possible
rating was 30. The principals’ ratings
would characterize the group of teachers as
highly superior in their relations with students.

The mean score on the student inventory
for all the teachers was 24.4, where the possible
range of score was +47 to -47. There
are no norms available with which to compare
this value.

The mean MTAI score of 27.5 is estimated
to be about average or slightly below average
for experienced teachers. No directly comparable
norm group was available; however,
norms on somewhat similar groups (beginning
teachers and graduate students in education
who had at least two years’ teaching experience)
suggest the above interpretation.

There appears to be fair agreement among
the means of the various measures with the
exception of the principals’ ratings.
Table 2


Intercorrelation of the Predictor (MTAI)
and the Various Criteria
Note: N = 77 classes.


                                             Students’     Principals’
                                             Ratings       Ratings        MTAI
(1) Observers’ Mean Ratings¹                 [illegible]   [illegible]    .40
(2) Students’ Ratings                                      .46            .49
(3) Principals’ Ratings (deviation scores)                                .19
Composite of (1), (2), (3)                                                .46
Composite of (1), (2)                                                     .50


¹ The correlation between the ratings of the two
observers was .33.
* Significantly greater than zero at the 1 per cent
level.


The intercorrelations among the various
criterion measures, as shown in Table 2, were
low with the exception of the correlation
of .46 between principal and student
ratings. The correlations between the MTAI
scores and the various criteria, singly and
combined, were significantly greater than zero
except for the principals’ rating: students’
ratings = .49; mean observers’ ratings = .40;
principals’ ratings = .19; composite of the
three criteria = .46; and the composite of
observers’ and students’ ratings = .50. Thus,
it appears that with the MTAI we can predict
the kind of interpersonal relations which
will exist in the classroom about as well as
we can predict academic performance by use
of intelligence tests. Presumably we are
measuring an aspect of personality which we
may refer to as “teaching personality.” By
“teaching personality” we mean those characteristics
of the teacher’s behavior tendencies
which are associated with the teacher’s
ability to establish harmonious working relations
with students.
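
The statement that every correlation with the MTAI except the one for principals' ratings differs significantly from zero can be checked approximately with N = 77. The article does not say which test was used; the sketch below substitutes Fisher's z transformation with a normal approximation, and the function name is ours.

```python
import math
from statistics import NormalDist

def fisher_z_test(r, n):
    """Two-sided p-value for the hypothesis rho = 0, via Fisher's z (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Correlations with the MTAI as reported in the text, N = 77 classes.
reported = {
    "students' ratings": 0.49,
    "mean observers' ratings": 0.40,
    "principals' ratings": 0.19,
    "composite of the three criteria": 0.46,
    "composite of observers' and students' ratings": 0.50,
}
for name, r in reported.items():
    z, p = fisher_z_test(r, 77)
    print(f"{name}: z = {z:.2f}, p = {p:.4f}")
```

Under this approximation only the coefficient of .19 fails to reach the conventional levels, in agreement with the text.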

The results of this study are in general
similar to those of the studies conducted by
Leeds (2, 3). The one major discrepancy between these
similar studies is in the principals’ ratings.
In Leeds’ studies the MTAI scores correlated
with the principals’ ratings with coefficients
of .43 and .46. These are somewhat higher
coefficients than that obtained in the present
study. The correlations of the MTAI scores
with each of the other estimates of the criterion,
that is, the observers’ ratings and the
students’ ratings, were rather similar in the
two studies. The correlation of MTAI scores
with the composite criterion in this study was
.46 as compared with .59 and .59 in Leeds’
studies. It would appear then that we have
a good start in finding predictors for our
criterion of human relations in the classroom.


Received June 16, 1952. 


References 


1. Cook, W. W., Leeds, C. H., and Callis, R. The
Minnesota Teacher Attitude Inventory. New
York: Psychological Corporation, 1951.

2. Leeds, C. H. A scale for measuring teacher-pupil
attitudes and teacher-pupil rapport. Psychol.
Monogr., 1950, 64, No. 6 (Whole No. 312).

3. Leeds, C. H. A second validity study of the Minnesota
Teacher Attitude Inventory. Elem. Sch.
J., 1952, 52, 398-405.




Fakability of the Jurgensen Classification Inventory 


H. P. Longstaff 


University of Minnesota 


and 


C. E. Jurgensen 


Minneapolis Gas Company 


With the increasing use of psychological 
tests by industry as an aid in selecting per- 
sonnel, considerable interest has developed 
in the problem of malingering on such tests. 
This is especially true of the pencil and paper 
type of interest and personality inventory. 
Bordin (1), Longstaff (4), and Strong (7) 
have shown that the Strong Vocational Inter- 
est Blank is fakable. Longstaff (4) also dem- 
onstrated the fakability of the Kuder Prefer- 
ence Record. Meehl and Hathaway (6) have 
found the Minnesota Multiphasic Personality 
Inventory fakable. Tiffin (8) reports a study 
which showed that the Humm-Wadsworth
Temperament Scale could be faked. Wes- 
man (9) has demonstrated similar results for 
the Bernreuter Personality Inventory. 

At least four different methods have been 
used to try to correct this weakness. One: 
The development of scoring keys to detect 
faking. The “L” scale of the MMPI (2) 
Two: Development of 
scales to correct scores 


Classification Inventory (3). 
feature of such i 


represent only “good” 
It was hoped th i 
the weakness 
mixed “good” 
lingerer could 




traits and did not possess any of the “bad”
ones.

Mais (5) developed and cross-validated 
a “self confidence” key for the Jurgensen 
Classification Inventory. He then had col- 
lege students take the test honestly and dis- 
honestly, i.e., trying to fake a high score in 
self-confidence. He found the mean score for 
his group changed from -5.9 (honest) to
6.9 (faked). This difference of 12.8 was 
significant at the .01 level. The Pearsonian 
correlation between the two scores was only 
iT, 


Mais’ study gave adverse data on the
Jurgensen Classification Inventory. However,
it was not a crucial study. The Classification
Inventory was developed for use in
personnel selection. In such industrial use,
applicants do not know what “traits” are
being measured. In fact, most keys are not
based on traits but on over-all job success.
It may be one thing to raise a score on a
specified and named trait such as self-confidence
and another thing to obtain a high
score on undefined job success.

The present study was designed to investigate
further the fakability of the Classification
Inventory. Two groups of University
of Minnesota students in personnel psychology
courses served as subjects. Group A consisted
of 41 juniors, seniors and graduate
students, the majority of whom had completed
numerous courses in psychology and
industrial relations. Group B consisted of
37 extension division students in an evening
class, and represented a less highly selected
group than the first.


Method

Each student took the Classification Inventory
under three sets of conditions: (1) honest
or accurate score, (2) fake good over-all, and
(3) fake high self-confidence. Directions for the
test under these three sets of conditions were as
follows:


1. Honest or accurate score. “This test has been
constructed quite differently from most personnel
tests. It has been tried out in industry and has
been phenomenally successful in certain instances.
Since you are students of personnel psychology,
my purpose in giving you the test is threefold:
first, I want you to become acquainted with it by
actually taking it; second, your standing in the
test may be of assistance to you in planning your
vocational future; third, we hope to build a key
that will assist us in directing students toward or
away from personnel work as a vocation. Please
answer the questions as accurately as you can as
they apply to yourself. It will be obvious from
the test that there are no right or wrong
answers. It is wholly a matter of personal preference
on your part; therefore, answer the questions
as they apply to you.”

2. Fake an over-all good score. “Last time, you
took the Classification Inventory under conditions
in which you were instructed to answer the questions
as they apply to you. In taking the test
this time imagine yourself in an employment department
of a large and prosperous company.
You have finished your education and are now
starting out upon your life’s work. You want
very much to get a job with this company and
hope to spend the rest of your life working for
them. Therefore, you want to make as good an
impression as you can. Answer the questions so
you will appear in the most favorable light to
the personnel manager.”
3. Fake high self-confidence score. “You have 
taken this test twice before. Today I would like 
to have you take it trying to fake your answers 
so as to make a high score in self-confidence.” 


Means and sigmas of scores obtained under 
the three conditions are given in Tables 1 
and 2. Results support Mais (5). Students 
significantly increased their scores in self- 
confidence when they attempted to do so, the 
increase averaging approximately one sigma. 
This increase is both statistically and prac- 
tically significant. Statistical significance 
(t = 8.75) is beyond the .01 level.


Table 1

Mean Scores on Self-Confidence Key Under Three Sets of Conditions

                                  Honest or     Fake          Fake
                                  Accurate      Over-all      High Self-
                                  Score         Good Score    Confidence

Group A
(N = 41 University students)       -2.37           .07          12.22
Group B
(N = 37 Extension students)         2.14          2.95          10.45
Total Group
(N = 78)                            -.23          1.44          11.42


Table 2

Variability of Scores (Sigmas) on Self-Confidence Key Under Three Sets of Conditions

                                  Honest or     Fake          Fake
                                  Accurate      Over-all      High Self-
                                  Score         Good Score    Confidence

Group A
(N = 41 University students)        9.52          6.71          10.62
Group B
(N = 37 Extension students)        11.46         11.43          11.73
Total Group
(N = 78)                           10.72          9.36          11.19
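The statement that the increase averaged approximately one sigma can be checked against the two tables. The sketch below is only an illustration: it expresses each mean gain in units of the honest-condition sigma, which is an assumption, since the text does not say which sigma served as the unit. The names `MEANS`, `SIGMAS` and `gain_in_sigmas` are hypothetical.

```python
# Means (Table 1) and sigmas (Table 2) on the self-confidence key.
MEANS = {
    "A": {"honest": -2.37, "fake_overall": 0.07, "fake_self_conf": 12.22},
    "B": {"honest": 2.14, "fake_overall": 2.95, "fake_self_conf": 10.45},
    "total": {"honest": -0.23, "fake_overall": 1.44, "fake_self_conf": 11.42},
}
SIGMAS = {
    "A": {"honest": 9.52, "fake_overall": 6.71, "fake_self_conf": 10.62},
    "B": {"honest": 11.46, "fake_overall": 11.43, "fake_self_conf": 11.73},
    "total": {"honest": 10.72, "fake_overall": 9.36, "fake_self_conf": 11.19},
}


def gain_in_sigmas(group, condition):
    """Mean gain over the honest condition, divided by the honest sigma.

    Using the honest-condition sigma as the unit is an assumption made
    for this illustration only.
    """
    gain = MEANS[group][condition] - MEANS[group]["honest"]
    return gain / SIGMAS[group]["honest"]


for group in ("A", "B", "total"):
    print(
        group,
        round(gain_in_sigmas(group, "fake_self_conf"), 2),
        round(gain_in_sigmas(group, "fake_overall"), 2),
    )
```

Under this assumption the total group gains a little over one sigma when faking self-confidence, and only a small fraction of a sigma when faking an over-all good score.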




The situation is different when we compare 
“honest” scores with attempts to fake “over- 
all good” scores. The increase is neither 
statistically nor practically significant. Obvi- 
ously, these students were unable to increase 
their scores in self-confidence when attempt- 
ing to appear in the most favorable light. 
However, the similarity of mean scores does 
not give the whole story; and the test is not 
as satisfactory for employment use as might 
appear. The Pearsonian correlation between 
“honest” and “fake over-all good” scores was 
only .28. Obviously, many of the students 
had attempted to increase their score. Al- 
though scores in general were not raised, they 
were changed. This, of course, would be a 
serious defect if the test were being used for 
selection. 
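The point that scores can be changed without being raised may be easier to see with numbers. The following sketch uses invented scores for five hypothetical examinees (not data from this study): the two sets have identical means, yet the product-moment correlation between them is low because individuals changed position.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores, invented for illustration only.
honest = [-10, -5, 0, 5, 10]
faked = [-5, 10, -10, 0, 5]

print(sum(honest) / len(honest), sum(faked) / len(faked))  # equal means
print(round(pearson_r(honest, faked), 2))                   # low correlation
```

A comparison of means alone would report no change here, while the rank order of the examinees has been substantially rearranged.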

Consideration of the foregoing findings 
raises an important question. Did students 
change their answers because they thought 
they could improve their scores or because 
they were instructed to fake? Essentially, 
they were directed to change answers and
perhaps we should not be too surprised when 
they follow instructions. 

To investigate this question, another group 
of 68 students comparable to the previous 
Group A was given the Classification Inven- 
tory with instructions which avoided direct 
orders to fake answers and which simulated 
more nearly industrial selection and voca- 
tional guidance conditions. Directions for 


the test under these two conditions were as 
follows: 


1. Industrial selection. “In taking this test
make the following assumptions. You have just
finished your college work


meeting
of the class you took the Classification Inventory
assuming you were applying for a job. Today I


tion you should go into. You finally decide to
go to The Student Counseling Bureau to see if 
they can give you any assistance. The counselor 
informs you, ‘We have a battery of tests we 
should like to have you take. We have found 
the results very helpful in dealing with problems 
like your own. The first test in the battery is 
called the Classification Inventory. Will you 
please read the directions and then answer the 
questions.’ ” 


Again it was found that mean scores were
essentially the same. The mean for the “Industrial”
situation was -1.28 (sigma of
8.98) and that for the “Vocational Guidance”
situation was -2.18 (sigma of 9.28).
This difference is not statistically significant
(t = .81).
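The reported t can be approximated from the summary figures alone. The sketch below assumes the usual formula for the difference between two correlated means, with N = 68 and the correlation of .50 reported for the "Industrial" and "Guidance" scores; the article does not state which formula was actually used, and the function name is hypothetical.

```python
from math import sqrt


def correlated_t(mean_1, mean_2, sigma_1, sigma_2, r, n):
    """t for the difference between two means from the same n subjects.

    Assumes sigma_1 and sigma_2 are standard deviations of the scores and
    divides the variance of the differences by n (dividing by n - 1 would
    give nearly the same value here).
    """
    variance_of_differences = (
        sigma_1 ** 2 + sigma_2 ** 2 - 2 * r * sigma_1 * sigma_2
    )
    standard_error = sqrt(variance_of_differences / n)
    return (mean_1 - mean_2) / standard_error


t = correlated_t(-1.28, -2.18, 8.98, 9.28, 0.50, 68)
print(round(t, 2))
```

With these inputs the result rounds to .81, which agrees with the value given in the text.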

The correlation between “Industrial” and
“Guidance” scores is .50. This is a substantial
increase from the former coefficient of
.28. It is apparent that the degree of faking
is materially reduced by avoiding the direct
suggestion to change answers. It should be
pointed out that we are dealing in these experiments
with a very intelligent and psychologically
sophisticated group of subjects.
That this is an important factor can be seen
by comparing the results of Group B (Tables
1 and 2) with the other groups. The extension
students (Group B) increased their
scores considerably less than did the other
groups comprised of more highly selected
students. Be this as it may, the fact clearly
stands out that all three groups materially
changed their answers and scores under the
different sets of conditions. Although modification
of directions toward greater realism
decreased the extent of change, the resultant
correlation of .50 is not encouraging. Obviously
faking is possible in the Classification
Inventory, and probably occurs when the instrument
is used for employee selection purposes.
Unfortunately, the extent of such
faking cannot be determined for any single
applicant.

Although various forced-choice tests differ
in the way in which items are selected, these
differences would not appear to be related to
attempts to fake. Presumably, findings from
these experiments might well be expected
to apply to the forced-choice technique in
general.




Summary 


1. Scores on self-confidence were significantly
raised when students attempted to
raise their scores and knew the test measured
self-confidence.

2. Scores on self-confidence were not increased
when students attempted to fake
good over-all scores when students did not
know that the test was scored for self-confidence.

3. Scores on self-confidence were not increased
when students changed from a simulated
“industrial” to a simulated “guidance”
frame of reference when the students did not
know that the test measured self-confidence.

4. The way in which instructions were
worded materially affected the extent of attempted
faking.

5. Although mean scores were not increased
when students did not know what trait was
being measured, individual scores were frequently
changed to a considerable extent.
This is evidenced by correlation coefficients
lower than reliability coefficients. So far
as interpretation is concerned, the attempt
to improve scores is probably as important
as the ability to improve scores.
How should a score at the fiftieth percentile
be interpreted? Does it reflect an average
amount of the trait being measured? Is it
the result of a successful attempt to raise a
low score? Or is it the result of an unsuccessful
attempt to further increase what is
already a high score? The answer is unlikely
to be known in any single case.

6. The Classification Inventory is not
recommended for use in situations where persons
are likely to be motivated to obtain
good scores.

7. Although these data were obtained on
the Classification Inventory, there is no reason
to believe that different results would be obtained
from any other forced-choice personality
test.

8. It is the opinion of the authors that
techniques other than the forced-choice technique
will have to be devised if the problem
of malingering on personality tests is to be
overcome.


Received May 22, 1952. 


References 


1. Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. psychol. Measmt.,
1943, 3, 49-75.

2. Hathaway, S. R., and McKinley, J. C. A multiphasic
personality schedule (Minnesota): I.
Construction of the schedule. J. Psychol.,
1940, 10, 249-254.

3. Jurgensen, C. E. Report on the Classification Inventory,
a personality test for industrial use.
J. appl. Psychol., 1944, 28, 445-460.

4. Longstaff, H. P. Fakability of the Strong Interest
Blank and the Kuder Preference Record.
J. appl. Psychol., 1948, 32, 360-369.

5. Mais, R. D. Fakability of the Classification Inventory
scored for self confidence. J. appl.
Psychol., 1951, 35, 172-174.

6. Meehl, P. E., and Hathaway, S. R. The K factor
as a suppressor variable in the MMPI. J.
appl. Psychol., 1946, 30, 525-564.

7. Strong, E. K., Jr. Vocational interests of men
and women. Stanford, Calif.: Stanford University
Press, 1943.

8. Tiffin, Joseph. Industrial psychology. New York:
Prentice-Hall, 1942, pp. 117-118.

9. Wesman, A. G. Faking personality test scores in
a simulated employment situation. J. appl.
Psychol., 1952, 36, 112-113.

10. Wiener, D. N. Subtle and obvious keys for the
MMPI. J. consult. Psychol., 1948, 12, 164-176.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953 


The Relationship Between the Judged Desirability of a Trait and 
the Probability That the Trait Will Be Endorsed * 


Allen L. Edwards 


The University of Washington 


There is a rather common suspicion among
many psychologists that subjects tend to give
what are considered to be socially desirable
responses to items in personality inventories.
This suspicion has been given public expression
in a recent article by Gordon (3,
p. 407) who comments upon “. . . the motivation
of a majority of respondents to
mark socially acceptable alternatives to items,
rather than those which they believe apply
to themselves.”

We have here two problems. One concerns
the truthfulness of a subject’s answers
to items in a personality inventory, i.e.,
whether the response accurately describes the
subject. The answer to this question implies
that we have available some independent
criterion in terms of which the inventory
response is to be evaluated. The other problem
concerns the relationship between a subject’s
response to an item and the social
desirability of that item, i.e., whether the
subject tends to give a positive answer to
an item that is socially desirable and a negative
answer to an item that is not. The
answer to this question implies that we have
available some measure of the social desirability
of the item to which the response can
be related. It is this problem we wish to
report upon here.


The Present Study 


presented before the Western Psychological Association, Fr


It is part of a
possible by an appointment as


Fellow of the Social Science Research Council


items is a monotonic increasing function of
the scaled social desirability of the items.

To study the relationship between the
probability of endorsement of personality
trait items and the social desirability of the
items requires that we determine independently
two measures: the probability of endorsement
and the social desirability scale
value of the items. This study thus consists
of two parts: in the first, the scale values of
the items are determined; in the second, the
probability of endorsement is related to the
independently determined scale values.


Determining the Scale Values 


A total of 140 personality trait items,
based upon Murray’s (4) discussion of needs,
were written and edited. The items were
selected so that 14 needs were investigated
with 10 items supposedly indicative of each
need. The items were arranged in 10 sets
of 14 items each, so that each set consisted
of one item relating to each of the needs.

The items were presented to subjects with
instructions to judge the degree of social desirability
of the behavior indicated by each
item in terms of how the behavior would be
regarded in others. Judgments were made
in terms of nine successive intervals, with the
lowest interval representing extreme undesirability
and the highest extreme desirability.
The rating system was explained in terms of
a sample set of four items for which judgments
had already been obtained. After
these ratings had been discussed, the instructions
to the subjects concluded with the
following statement:

“Indicate your own judgments of the desirability
or undesirability of the traits which
will be given to you by the examiner in the
same manner. Remember that you are to
judge the traits in terms of whether you consider
them desirable or undesirable in others.
Be sure to make a judgment about each
trait.”
The subjects judging the desirability of the
items consisted of 86 men and 66 women, a
total of 152 subjects. Twenty-six of the
subjects were under 20 years of age, 97 were
between 20 and 30 years of age, and 29 were
over 30 years of age.

Cumulative distributions of the judgments
were made separately by age and by sex
groups. For each item we then found the
interval in which the median of the distribution
of judgments would fall.

In Figure 1, we show the plot of the
median intervals for the women against the
corresponding values for the men. It may be noted that
in the case of only two items would the
medians be separated by as much as two
intervals. For 43 of the items the medians
might possibly be separated by as much as
one interval. For the remaining 95 items
the medians would all fall within the same
interval.

In the case of many of the items falling
off the principal diagonal of Figure 1,
the medians would still be approximately the
same for the reason that the medians of both
distributions are close to the limit of the
interval, but one happens to fall slightly
above and the other slightly below the limit.

A similar analysis of the judgments was
made in terms of the age variable. Analysis
of the separate distributions indicated
that the scale values that would thus be obtained
would be comparable and that little
distortion would be introduced by pooling
the judgments for all groups.
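The step of locating the interval in which the median of a distribution of judgments falls can be sketched as follows. This is only an illustration of that one step: the frequencies are invented, the function name is hypothetical, and the intervals are numbered 1 (extreme undesirability) through 9 (extreme desirability) as in the rating instructions.

```python
def median_interval(frequencies):
    """Return the 1-based interval containing the median judgment.

    `frequencies` holds the number of judgments falling in each of the
    successive intervals, from least to most desirable.
    """
    total = sum(frequencies)
    if total == 0:
        raise ValueError("no judgments recorded for this item")
    half = total / 2
    cumulative = 0
    for interval, count in enumerate(frequencies, start=1):
        cumulative += count
        if cumulative >= half:
            return interval
    raise AssertionError("unreachable: cumulative count must reach the total")


# Hypothetical judgments of one item by 86 men and 66 women.
men = [2, 3, 5, 10, 20, 25, 12, 6, 3]
women = [1, 2, 4, 8, 16, 22, 8, 3, 2]

print(median_interval(men), median_interval(women))
```

Comparing the two returned intervals item by item gives the kind of plot described for Figure 1. When a cumulative count lands exactly on half the total, the median lies on the boundary between two intervals, which is the situation the text mentions for items falling off the diagonal.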

On the basis of the combined distribution,
the scale values of the 140 items were
found. The scale values were determined by
the method of successive intervals. This
method of scaling does not involve the assumption
of equality of the successive
intervals. After determining the widths of the
successive intervals and the scale values of the
items on the psychological continuum of social
desirability, an internal consistency test
was applied (1). Using the 147 values
calculated from the data, it was possible to
reproduce the 1,120 independent, empirical
observations with an average error of .023.
This value, it may be mentioned, compares
favorably with that usually obtained from
internal consistency tests used when stimuli
are scaled by the method of paired comparisons.


Fig. 1. Interval in which the median of the women’s
distribution of judgments would fall plotted
against the interval in which the median of the men’s
distribution of judgments would fall.


Relationship Between Scale Values and 
Probability of Endorsement 


In the second part of this study, a sample 
of 140 pre-medical and pre-dental students 
responded to the same set of items for which 
we had previously determined the scale values 
on the psychological continuum of social de- 
sirability. This time, however, the items ap- 
peared in a printed form as a personality in- 
ventory. The inventory was part of a test 
battery which was administered for the Medi- 
cal and Dental Schools of the University 
of Washington. The instructions were those 
that are commonly used with personality in- 
ventories. A “Yes” response indicated that 
the subject believed that a given item was 
characteristic of himself and a “No” response
that it was not.

Counts were made for each item, by
means of IBM equipment, and the per cent
responding “Yes” was then found for each
item. This per cent is the proportion of the
sample indicating that the behavior stated
by a particular item is characteristic of themselves.
The proportions may be taken as the
probability of endorsement of a particular
trait item for the sample at hand.


Fig. 2. Probability of endorsement of a trait item
plotted against the social desirability scale value of
the item. The product-moment correlation coefficient is .871.
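The computation described above, the proportion answering "Yes" to each item and its product-moment correlation with independently obtained scale values, can be sketched as follows. The responses and scale values are invented for illustration (four hypothetical subjects, three hypothetical items) and the function names are assumptions, not part of the original study.

```python
from math import sqrt


def endorsement_proportions(responses):
    """Proportion of subjects answering "Yes" to each item.

    `responses` is a list with one entry per subject; each entry is that
    subject's list of "Yes"/"No" answers, in item order.
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    return [
        sum(1 for answers in responses if answers[item] == "Yes") / n_subjects
        for item in range(n_items)
    ]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, for illustration only.
responses = [
    ["Yes", "No", "Yes"],
    ["Yes", "No", "No"],
    ["Yes", "Yes", "No"],
    ["No", "No", "Yes"],
]
scale_values = [3.5, 1.0, 2.0]  # hypothetical social desirability values

proportions = endorsement_proportions(responses)
print(proportions, round(pearson_r(scale_values, proportions), 2))
```

The essential design point is that the scale values come from one group of judges and the endorsement proportions from a different sample, so the two measures are obtained independently before being correlated.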

The probability of endorsement of each
item was plotted against the previously, and
independently, determined social desirability
scale value of the item


in Figure 2,
probability of e


the social desirability scale value
of the item. The product-moment correlation
coefficient is .871.


Discussion 


The data clearly indicate that the probability
of endorsement of an item increases
with the judged desirability of the item.1
This does not necessarily mean that the subjects
are misrepresenting themselves on the
inventory. It may be that traits which are
judged as desirable are those which are
widespread or common among members of
a culture or group. That is, if a pattern of
behavior is prevalent among members of a
group, it will be judged as desirable; if it is
uncommon, it will be judged as undesirable.
We might thus expect items indicating desirable
traits to be endorsed more frequently
than items indicating undesirable traits.

It is also possible that the behavior indicated
by an item with a high social desirability
scale value is not common, but
that the subject taking the inventory is trying,
consciously or unconsciously, to give a
good impression of himself. He therefore
tends to distort his answers in such a way as
to make himself out as having more of the
socially desirable traits and fewer of the socially
undesirable traits than might be the
case if his behavior were evaluated in terms
of some other independent criterion.

1 There is a slight indication of curvilinearity
at the two extremes




Either one or both of the interpretations
offered would account for the relationship
between probability of endorsement and
social desirability of the item. I have no
evidence to support the interpretation that the
subjects misrepresented themselves on the
inventory, but Ellis (2) in his recent review
cites quite a few studies which would indicate
that this is the case.

If this is true, then in a personality inventory
we should attempt to minimize the tendency
for a given response to be determined
primarily by the factor of social desirability.
One solution is to pair items indicative
of different traits in terms of their social
desirability scale values. If the subject is
then forced to choose between the two items,
his choice obviously cannot be upon the basis
of the greater social desirability of one of
the items.
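The pairing idea in the last paragraph can be sketched as a simple procedure. This is an assumption-laden illustration, not a procedure given in the article: the greedy matching rule, the tolerance `max_gap`, and the item names, traits and scale values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    trait: str
    scale_value: float  # social desirability scale value


def pair_items(items, max_gap=0.25):
    """Greedily pair items of different traits with similar scale values.

    Items are sorted by scale value; each unpaired item is matched with the
    next item of a different trait whose scale value is within `max_gap`.
    Items with no acceptable partner are left unpaired.
    """
    remaining = sorted(items, key=lambda item: item.scale_value)
    pairs = []
    while remaining:
        first = remaining.pop(0)
        for index, candidate in enumerate(remaining):
            gap = candidate.scale_value - first.scale_value
            if gap > max_gap:
                break  # sorted order: later candidates are farther away
            if candidate.trait != first.trait:
                pairs.append((first, remaining.pop(index)))
                break
    return pairs


# Hypothetical items, for illustration only.
items = [
    Item("A1", "achievement", 1.0),
    Item("B1", "order", 1.1),
    Item("A2", "achievement", 2.0),
    Item("C1", "affiliation", 2.05),
    Item("B2", "order", 3.5),
]
pairs = pair_items(items)
print([(a.name, b.name) for a, b in pairs])
```

Because the two members of each pair are nearly equal in judged desirability, a choice between them carries little information about desirability and more about which trait the subject regards as characteristic of himself.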


Received June 3, 1952. 


References 


1. Edwards, A. L. The scaling of stimuli by the
method of successive intervals. J. appl. Psychol.,
1952, 36, 118-122.

2. Ellis, A. The validity of personality questionnaires.
Psychol. Bull., 1946, 43, 385-440.

3. Gordon, L. V. Validities of the forced-choice and
questionnaire methods of personality measurement.
J. appl. Psychol., 1951, 35, 407-412.

4. Murray, H. A. Explorations in personality. New
York: Oxford University Press, 1938.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953 


A Note on “Interest Item Response Arrangement” 


John V. Zuckerman 


Human Resources Research Office, The George Washington University


In a private communication, Cronbach has
called my attention to some aspects of my recent
article (5) which should be clarified.
Some methodological problems were not sufficiently
explained in the original article, a
basic assumption was left unstated, and in
addition some violence was done in citing
Cronbach’s position with respect to the effect
of item response arrangement on measurement
of traits or qualities. The points to be
considered may be examined topic by topic.


Reliability 


In a comparison of two interest test forms, 
FE (with 168 two-choice items) and OE (con- 
taining 112 L-I-D items), four product-mo- 
ment reliabilities (corrected for test length by 
the Spearman-Brown formula) were computed 
for four empirical keys, using odd-even tech- 
nique. With one exception, the reliabilities 
were similar for the two forms. For the key 
which was discrepant, OE had a higher re- 
liability. Odd-even reliabilities cannot be in- 
terpreted as estimates of test-retest reliabili- 
ties, particularly because, for L-I-D or similar 
scales, odd-even figures would be raised in 
the event that a transient response set were 
affecting performance throughout the test. 
Evidence of such response set might be found 


in the number and direction of weights for 
respo: 


-item consistency but a high 
A low split-half cor- 

obscure the high real relia- 
Retabulation 
bers of positio 


. Shows that ther 


of my data in terms of num- 


teachers to disli 
trators, admi 


than teachers, and teachers to dislike more
things than educators in general.

These tendencies were capitalized on by the
empirical scoring keys used in the study. To
determine whether reliabilities of those
keys are lower than for forced-choice keys,
the study would require the addition of test-retest
reliability information which is not
presently available.
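The correction for test length mentioned at the start of this section follows the standard Spearman-Brown formula. The sketch below illustrates it with an invented odd-even correlation; the value .60 and the function name are hypothetical and are not taken from the study.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Spearman-Brown estimate of reliability for a lengthened test.

    `r_part` is the correlation between comparable parts (for example the
    odd and even halves); `length_factor` is how many times longer the
    full test is than one part (2 for split halves).
    """
    return length_factor * r_part / (1 + (length_factor - 1) * r_part)


# Hypothetical odd-even correlation, for illustration only.
print(round(spearman_brown(0.60), 2))
```

As the text cautions, a corrected odd-even figure of this kind describes consistency within a single sitting and cannot stand in for a test-retest coefficient, since a transient response set would inflate it.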


Validity

My study was intended to compare the relative
discrimination provided by forced-choice
and L-I-D item forms. The experimental design
involved the assessment of relative discrimination
of four scales by rescoring blanks
of most of the original subjects. The assumption
was made that any shrinkage for
OE scales would be the same for FE scales
upon a cross-validation. Cronbach points out
that when L-I-D or similar three-choice items
are assigned weights, there are more possibilities
that weights could arise out of chance
differences than where forced-choice pairs are
used. He states further that the more chance
discriminations are counted in the key, the
more the validity will shrink on a fresh sample,
based on mathematical considerations.


Table 1


Retabulation of Zuckerman’s Data (5) in Terms
of Numbers of Positions Weighted*
No. a bl 
of Positions Weighted 
Scale Items ee 
Name Weighted L+ L— I+ i= 32 
EDENG 94 66 16 3 39 2 2 
ADM 38 ii o 2 5 15 
TEA 41 1 18 22 #1 i 22 
AD-TEA 4 95 6 6 15 
r$ 
catte 
* Positive direction of weights in favor of edu th 
ADMD ENG, for adminin and toer in Wy 
M an BA inistra! ion 
AD-TEA al scales, and for admini: jol 


- See original article for explan® 
scale construction (5), 




Probable shrinkage, according to Cronbach,
depends on factorial complexity of the item
matrix, the number of items (or weights), the
number of subjects tested, and the criterion.
No data are available from my
study which bear on the problem of differential
shrinkage for different item forms. To
settle this point, a follow-up study will have
to be made in which cross-validation procedures
are used.

In a study intended to achieve the same
purpose as mine, but with a different methodology
and subject matter, Gordon (4) found
forced-choice personality questions provided
more discrimination than open-ended
rating scale statements. Differences in criteria,
measuring instruments, and methods
prevent direct comparison of the studies, however.

Response Set 


While the general tone of Cronbach’s
articles on response set (1, 2) was
unfavorable toward the use of item forms
such as L-I-D, he did clearly raise the possibility
that it is desirable to capitalize on response
set variance (especially 2, pp. 17, Zi
and 8). My statements regarding his position
(5, pp. 79 and 84) were in error. A
recent study by Cronbach and another author
(3) expresses Cronbach’s current appraisal of
the problem in terms of a mathematical consideration
of profile analysis. Response-set
may appear as a mathematical factor entitled
elevation. The investigator is advised to consider
the meaning, if any, of the factor, and
determine whether it is to be included in his
scoring procedure. There is no basic disagreement
between Cronbach and myself on
this point.


Received January 21, 1953. 
Published out-of-turn by the editor. 


References


1. Cronbach, L. J. Response sets and test validity.
Educ. psychol. Measmt., 1946, 6, 475-493.

2. Cronbach, L. J. Further evidence on response
sets and test design. Educ. psychol. Measmt.,
1950, 10, 3-31.

3. Cronbach, L. J., and Gleser, Goldine C. Similarity
between persons and related problems
of profile analysis. Technical Report No. 2,
ONR Contract N6ori-07135. Urbana, University
of Illinois, April, 1952.

4. Gordon, L. V. Validities of the forced-choice and
questionnaire methods of personality measurement.
J. appl. Psychol., 1951, 35, 407-412.

5. Zuckerman, J. V. Interest item response arrangement
as it affects discrimination between professional
groups. J. appl. Psychol., 1952, 36,
79-85.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953 


Effects of the Nature of the Problem on LGD Performance * 


Bernard M. Bass and Cecil R. Wurster 


Louisiana State University 


The basic scheme of group situational tests 
is to place examinees as a group in a prob- 
lem or work situation. Examiners observe, 
record, or rate examinees’ behavior as mem- 
bers of the group. The hypothesis under- 
lying the method is that the situational test 
is a valid sample of behavior for predicting 
future behavior in a real group situation. It 
has been verified by a number of studies 
(eigu2, 4511), 

Because of the wide variety of possible 
group situations, a large number of variations 
in group situational tests have been tried. 
Candidates for positions of leadership have 
been assessed: (a) in initially leaderless situa- 
tions (e.g. 4); (b) in situations where each 
candidate, in turn, has been appointed leader 
(e.g. 2); (c) in situations where a staff mem- 
ber has served as leader (e.g. 1); and (d) 
in situations where the leader has been elected 
by the group (7). Arbous and Maree have 
reported a median correlation of .67 between 
assessments based on observations of the 
same candidates in situations a and b. 

A problem for solution may or may not 
have been Presented. Some studies have 
given participants a choice of problems to 
discuss (e.g. 11); others (e.g. 9) have al- 
lowed the group to originate the problem; 
while still others have assigned the problem 
(e.g. 4), 

Various kinds of Problems have been pre- 
sented. These have included general inter- 
est Problems such as “Select the ten outstand- 
eter ea) mo 
ee = suc as: Develop a pro- 

pervisors in this plant” (3); 
hee Ree ee of human relations 
cide what the best an i Ta aka 

course of action will be (8). 




The purpose of the present study was twofold.
The first aim was to see the extent to
which a person’s successful leadership activity
in an initially leaderless discussion changed
when there was a systematic change in the
nature of the problem and the persons with
whom he was grouped. The second purpose
was to see whether assessments based on
some types of discussion situations were more
related than others to various measures of
company rank, education, intelligence, supervisory
aptitude, age and appraisals of supervisory
behavior on-the-job.


Subjects and Method 


The subjects were a class of 23 students in
an introductory psychology course and 131
oil refinery supervisors. The 23 students
were divided purposefully into three groups
and each observed in a half-hour LGD under
one of three types of conditions: (a) unstructured—participants
originated problems
for discussion; (b) general leader specifications—e.g.
participants developed a set of
factors for choosing the world’s greatest leaders;
(c) case history—e.g. participants decided
whether a returning veteran should
tell his wife about an illegitimate child he
fathered overseas.
Then, three new groups were formed so that as few members of the same first groups as possible were together for a second time and so that all participants could be assessed under a condition different from the condition in which they were first tested. Finally, a third recombination was carried out so that all 23 participants were observed in each of the three conditions. Conditions b and c were altered slightly on each successive administration to avoid having participants specifically prepared. Thus, different types of specifications were demanded and different case histories were used in the successive administrations. LGD scores were based on ratings of the extent to which each participant exhibited successful leadership activity in a given discussion.

The 131 supervisors were assessed under one of four conditions: (a) unstructured; (b) out-plant leader specifications; (c) in-plant leader specifications; (d) case history. Situation c concerned the specifications for selecting shift foreman, supervisors and so forth. A case history concerned such problems as what Mike should do when his superior [...] him out in front of his subordinates or what Harry's superior should do when he finds various faults with Harry's method of [...] a work gang. Five unstructured and four of each of the other types of situations comprised the 17 group discussions. LGD scores were corrected for group size and variations among observer's standards.


Table 1

Correlations Among LGD Scores Earned by Participants Subjected to Three Different Types of Discussion*

                                        Type of LGD
                                        Leader           Case
Type of LGD              Unstructured   Specifications   History   Average
Unstructured                  --            .58            .66       .62
Leader Specifications                       --             .51       .54
Case History                                               --        .58




Results

Table 1 displays the intercorrelations among LGD scores obtained by the 23 participants on the basis of each of the three types of discussions. Since an LGD test-retest reliability of [...] has been reported for repeated discussions a week apart with changed participants but the same type of problem (8); and since it has been even lower (r = .53) when a year intervenes between repeated measurements and the outside status of some participants changes more than others (5), it appears from the results reported in Table 1 that some, but not very much, variation in LGD behavior could be attributed to variations in the nature of the stimulating situation.

For a more detailed description of the scoring procedure, the reader is referred to (6).

The variation from .51 to .66 in intercorrelation and from .54 to .62 in average intercorrelation is most probably due to chance. Since the usual validities of these various types of LGD's range from .30 to .50, these intercorrelations suggest that including more than one type in a battery would not raise the validity of any two much above the validity of any one, although the reliability of the composite might be raised substantially.
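The argument about combining LGD types can be made concrete with a short calculation. The sketch below is illustrative only: it assumes two standardized, equally weighted LGD scores, a hypothetical single-test validity of .40 (inside the .30 to .50 range cited), an intercorrelation of .58 (inside the .51 to .66 range reported), and it treats that intercorrelation as a parallel-forms reliability for the Spearman-Brown step. The function names are not from the source.

```python
import math


def composite_validity(r1y: float, r2y: float, r12: float) -> float:
    """Correlation between a criterion and the unit-weighted sum of two
    standardized tests with validities r1y, r2y and intercorrelation r12."""
    return (r1y + r2y) / math.sqrt(2.0 + 2.0 * r12)


def spearman_brown(r12: float, k: int = 2) -> float:
    """Reliability of a composite of k parallel measures that correlate r12."""
    return k * r12 / (1.0 + (k - 1) * r12)


# Hypothetical inputs chosen for illustration (see the assumptions above).
single_validity = 0.40
intercorrelation = 0.58

combined = composite_validity(single_validity, single_validity, intercorrelation)
reliability = spearman_brown(intercorrelation)

print(f"validity of one LGD:       {single_validity:.2f}")
print(f"validity of two combined:  {combined:.2f}")
print(f"reliability of composite:  {reliability:.2f}")
```

Under these assumptions the validity moves only from .40 to about .45, while the estimated reliability rises from .58 to about .73, which is the pattern the paragraph describes.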

Table 2 indicates the correlations between two independent clusters of highly interrelated variables and LGD scores earned by the refinery supervisors in one of four types of discussions. (The clusters were isolated by inspection of an intercorrelation matrix. Cluster I consisted of supervisor's rank in the company, education, intelligence, supervisory aptitude and youth. Cluster II consisted of superiors' on-the-job appraisals of the supervisors by means of graphic and forced-choice rating scales (6).) Supervisors were classified into a lower and upper echelon of management. Correlations between rank and LGD scores were biserial; the remaining were Pearson product-moment.

Chi square tests of the significance of the variations in correlations from one discussion type to the next 3 suggested that only one set of correlations, those between company rank and LGD scores, varied significantly at the 1 per cent level of confidence. It was inferred from the correlation of .99 between rank and case history LGD scores that upper echelon supervisors were the sole leaders in such discussions; their tendency to exert leadership in the supposedly leaderless situation declined somewhat when discussions involved situations outside the company as in the out-plant leader specifications and the unstructured discussions. (In the latter, the problem originated by the participants for discussion quite often concerned improving the town sewerage system, increasing civic pride, and so forth.) The one hypothesis worthy of further investigation drawn from these results, therefore, was that a supervisor of high rank is most likely to play the role of leader among persons of lower appointed rank when the group problem specifically concerns situations for which he has the high rank.

[...] in the range [...] especially, supervisory [...] and superior's appraisals [...] percentage of the [...] for their present posts because of their high supervisory aptitude test battery scores.

3 This test is described in Edwards, A. L. Experimental design in psychological research. New York: Rinehart, p. 133-135.


Table 2

Correlation Between LGD Scores of Oil Refinery Supervisors Subjected to One of Four Types of Discussion Situations and Their Rank, Education, Intelligence, Supervisory Aptitude, Age and Superior's Appraisals

                                     Sub-Samples According to LGD Type
                                          Out-Plant        In-Plant
                                          Leader           Leader           Case
                           Unstructured   Specifications   Specifications   History
                               (a)            (b)              (c)            (d)      Total
No. of Groups                   5              4                4              4         17
No. of Subjects                35*            33*              31*            32        131
Cluster I
  Rank†                        .87            .81              .91            .99       .88
  Education                    .63            .47              .60            .57       .57
  Intelligence                 .50            .61              .41            .34       .45
  Supervisory Aptitude         .34            .28              .07            .54       .30
  Youth                        .30            .01              .31            .24       .19
Cluster II
  Graphic Appraisal            .02           -.38             -.11           -.04      -.12
  Forced Choice Appraisal      .02           -.22             -.12            .28      -.01

* Because of missing information on intelligence, supervisory aptitude scores and superior's appraisals, many of the sub-sample correlations of LGD scores with these variables are based on as few as 21 cases.
† Sub-sample variation in correlation with LGD significant at the 1 per cent level of confidence. This set computed by means of biserial r. The others are Pearson correlations.
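The chi square comparison of sub-sample correlations can be illustrated with a short sketch. It assumes the usual Fisher z test of homogeneity: each r is converted to z, weighted by n - 3, and the weighted squared deviations from the mean z are summed, giving a chi square with k - 1 degrees of freedom. Whether this is exactly the computation the authors followed is an assumption, and the function name is hypothetical. The Education correlations of Table 2 (.63, .47, .60, .57) are used with the full sub-sample sizes, which is also an assumption since the number of cases actually available per cell is not reported.

```python
import math


def homogeneity_chi_square(rs, ns):
    """Chi square (df = k - 1) for the hypothesis that k independent
    correlations estimate one common population value."""
    zs = [math.atanh(r) for r in rs]   # Fisher z transformation of each r
    ws = [n - 3 for n in ns]           # weight = 1 / sampling variance of z
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))


# Education vs. LGD score in the four discussion types (assumed readings).
education_rs = [0.63, 0.47, 0.60, 0.57]
subsample_ns = [35, 33, 31, 32]

chi2 = homogeneity_chi_square(education_rs, subsample_ns)
CRITICAL_1_PERCENT_DF3 = 11.345  # tabled chi square, 3 df, 1 per cent level

print(f"chi square = {chi2:.2f}")
print(f"significant at the 1 per cent level: {chi2 > CRITICAL_1_PERCENT_DF3}")
```

With these figures the statistic falls far below the 1 per cent critical value, in line with the report that only the rank correlations varied significantly. The rank correlations themselves are biserial, so the z transformation would be only a rough guide for them.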

The correlation of .54 between “case history” LGD scores and supervisory aptitude suggested a valid consistency between the assessments based on the case history LGD and the supervisory aptitude battery, a battery which gave substantial weight to paper-and-pencil tests of supervisory judgment. When the masking influence of rank, the lower reliability and validity of the graphic in comparison to the forced choice appraisal and the great restriction in range of the appraisals were all taken into account, it was inferred from the correlation of .28 between case history LGD scores and forced choice appraisals that the case history LGD is the most likely type of those investigated to provide a valid predictor of adequacy on-the-job, where the examinees are of different known organizational rank, and previously have been screened by means of valid paper-and-pencil test batteries.


Summary

The purposes of the present study were to see the effects on their behavior of changing the nature of the problem confronting LGD participants.

LGD scores of 23 college students correlated [...] in repeated situations where the composition of the group and the problem for discussion were systematically altered. These correlations were not much lower than the test-retest reliability (r = [...]) of one type of LGD.

The extent to which various personal factors were associated with LGD performance of oil refinery supervisors depended to some extent on the nature of the problem under discussion. Major findings were:

1. A high-ranking supervisor is more likely to exert leadership in small discussion groups with supervisors of lower rank when the discussion specifically concerns situations for which he has the high rank.

2. The amount of successful leader activity in discussions of case histories of human relations problems appears related to paper-and-pencil predictors of supervisory success (r = .54) and to a lesser extent with forced choice on-the-job appraisals of supervisory success (r = .28).


Received June 2, 1952.


References


1. Ansbacher, H. L. Lasting and passing aspects of German military psychology. Sociometry, 1949, 12, 301-312.

2. Arbous, A. G., and Maree, J. Contribution of two group discussion techniques to a validated test battery. Occup. Psychol., 1951, 25, 1-17.

3. Bass, B. M. Situational tests. II: Variables of the leaderless group discussion. Educ. psychol. Measmt., 1951, 11, 196-207.

4. Bass, B. M., and Coates, C. H. Forecasting officer potential using the leaderless group discussion. J. abn. soc. Psychol., 1952, 47, 321-325.

5. Bass, B. M., and Coates, C. H. Studies of leadership in ROTC. In preparation.

6. Bass, B. M., and Wurster, C. R. Effects of company rank on LGD performance of oil refinery supervisors. J. appl. Psychol., 1953, 37, 100-104.

7. Fields, H. An analysis of the group oral interview. Personnel, 1951, 27, 480-486.

8. French, R. L., and Bell, B. Consistency of individual leadership position in small groups of varying membership. J. abn. soc. Psychol., 1950, 45, 764-767.

9. Garforth, G. I. De La P. War officer selection boards. Occup. Psychol., 1945, 19, 97-108.

10. Taft, R. Some correlates of the ability to make accurate social judgments. Ph.D. Dissertation, University of California: Berkeley, 1950.

11. Wurster, C., and Bass, B. M. Situational tests: IV. Validity of leaderless group discussions among strangers. Educ. psychol. Measmt. In press.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953


Effects of Company Rank on LGD Performance of Oil Refinery Supervisors *


Bernard M. Bass and Cecil R. Wurster 


Louisiana State University 


Mandell (4), among others, has hypothe- 
sized that candidates for employment or pro- 
motion who are assessed in a leaderless group 
discussion or group oral performance test 
should be unacquainted with each other; 
otherwise “they may defer to a candidate who 
has high prestige in the group, or who has a 
higher-level position.” The primary purpose 
of this study was to investigate the extent to 
which a person’s performance in the LGD 
was influenced by his administrative rank 
outside the immediate stimulating situation. 

A number of sub-hypotheses were tested 
and a number of relationships were uncovered 
concerning the interactions between company 
rank, degree of successful leader activity in 
the LGD, rated performance by superiors as 
a supervisor, age, education, intelligence and 
knowledge and attitudes predictive of success 
in supervisory work. 

It was believed that the results would be 
of interest to those engaged in using the 
LGD to screen applicants for employment 
or promotion. They would also provide 
further information to a growing body of 
knowledge concerning leader-follower 
relations in small groups. 


Subjects

A total of 131 supervisors at a large oil refinery participated in leaderless group discussions. [...] level maintenance [...] were first [...]; 18 [...] third and fourth [...] production, maintenance, [...]. In addition, [...] engineers, accountants or other technical [...] had no supervisory [...] highly responsible technical positions which called for little direct supervision. The subjects ranged in age from under 30 to over [...] and from sixth grade to Ph.D. in education. The average subject was 43 years old and a high school graduate.

One restriction which most probably served to severely attenuate the various relationships studied was caused by the high percentage of subjects who had been selected for their jobs by a previously-validated battery of psychological tests. A further factor which probably served to restrict the range of observable differences was the large amount of supervisory training these subjects had received from various formal and informal programs.


Method

Approximately 20 supervisors at a time met for a week-long supervisory training program. On the fourth day, they were divided into two or three groups, 6, 7, 8, 9 or 10 to a group, and administered one of four types of leaderless discussions.1 A total of 17 discussions was run, each observed by one of four trained raters. Directions and scoring 2 were similar to previous studies at Louisiana State University (e.g. 2).

Two types of criteria of on-the-job success as supervisors were available: forced-choice and graphic appraisals by the subjects' superiors. A 2-page supervisory performance report [...] 1950 by at least two of their superiors was obtained from the records of 123 of the subjects. These ratings, developed by Richardson, Bellows and Henry, Inc., had odd-even and equivalent-form reliabilities above .90 for the groups for which they were standardized. Inter-rater reliability was .69. Validity of the ratings as [...] by their tendency to differentiate previously identified above average, average and below average supervisors ranged from .62 to .84 for the various forms and departments of the refinery. However, for the present restricted sample, inter-rater agreement was only .43. Since most subjects' average appraisals were based on independent ratings by as many as six superiors, the actual reliability of this measure was appreciably higher.

1 A separate report will deal with variations in performance on the LGD as a function of the nature of the discussion problem.

2 The single observer rated on a 3-point scale the extent to which each participant exhibited [...] behaviors: (1) showed initiative; [...] (3) clearly defined problems; [...] solutions; (5) [...] others; (7) [...] influenced others; [...] the discussion. A participant's score was the sum [...] of group size, [...] scores were adjusted according to the [...] earned by participants of groups of each [...]. The distribution of scores assigned by [...] observer was transformed into a sten distribution (3) in order to make fairly comparable all [...] assigned by the different observers.

Corresponding graphic ratings were likewise available for these 123 subjects. For this sample, inter-rater correlation was only .29. Otis intelligence test scores for 87 subjects and “supervisory aptitude” test scores for 92 subjects were also available. The supervisory battery test scores were an optimally weighted composite of scores of performance on a forced-choice test of supervisory judgment, an empirically scored forced-choice personality inventory, and scores on certain scales of the Kuder Preference Record. For the original standardizing group, the optimally weighted battery of scores correlated [...] with superiors' ratings of the subjects.


Results

Table 1 displays the matrix of intercorrelations among LGD scores, company rank, education, youth, intelligence, supervisory aptitude, forced-choice appraisals and graphic superior's appraisals. The first six have been grouped into one cluster of highly interrelated variables while the last two form a second cluster. The average correlation between [...] others of [...] and cluster II [...]. Correlations reported are Pearson product-moment except those between rank and other variables which are biserial. Criterion ratings, supervisory battery and intelligence test scores were available in standard score form with means of 20 and standard deviations of 5 for the original standardizing population of supervisors and candidates for supervisory positions. It should be noted that the sample used in this study was decidedly restricted in range on these significant variables. The sample mean was half a standard deviation higher in mean criterion ratings and supervisory aptitude scores than the original population from which many of its members were drawn. Restrictions in range were from 12 to 58 per cent which severely attenuated the relationships reported.

Age scores reversed in sign to make positive the correlations of age with the variables of cluster I.

The small proportion of supervisors in the third and fourth echelons of management in this study led the investigators when correlating rank with the other variables to combine them in an upper management group of 25 cases [...] with one first level supervisory group.

The first cluster of six variables had a mean intercorrelation of .48 while this cluster correlated .00 with the second cluster of two variables. Thus, it appeared that performance on the LGD was highly related to company rank (r_b = .88) and to a lesser extent with the other variables closely associated with rank: education (r = .57), intelligence (r = .45), supervisory aptitude (r = .30) and youth (r = .19). LGD performance was unrelated to superiors' appraisals. Further analyses indicated that the mean LGD scores (in stens) for subjects from the first, second and combined third and fourth echelons of supervision were 3.6, 6.7, and 6.7 respectively, which according to an analysis of variance were significantly variant at the 1% level. No such significant differentiation was found when all first-line maintenance supervisors, whose mean LGD score was 3.4, were compared with all first-line process and production supervisors, whose mean was 4.0. Staff and technical men, not included in the above samples, had a mean LGD score of 5.4. This intermediate value probably reflected their subordinate position compared to upper echelon supervisors but their superior education and intelligence to first-line supervisors.

Company rank appeared significantly related to forced-choice criterion ratings earned (r = .34) but not to graphic ratings. Rank correlated significantly with supervisory battery test scores (r = .42). In this highly technical industry, it was not surprising to observe the almost complete interdependence of supervisory rank and education (r_b = .98).


Not enough cases were available to obtain the correlations between rank and intelligence although it is expected that it was at least .50 since education and intelligence in this sample correlated .57.

To see if company rank was masking any relationships between the other variables, two attempts were made to study the relations among the other variables when rank was partialed out. Table 2 shows the appropriate partial correlations among LGD scores, supervisory aptitude, youth and forced-choice and graphic appraisals. Since education and rank were about perfectly correlated, it was impossible to partial out one without eliminating the variance of the other.


Table 1

Intercorrelations Among Company Rank, Education, Intelligence, LGD Score, Supervisory Aptitude, Youth, Superiors' Appraisals, and Two Clusters of Highly Intercorrelated Variables

                                                                                                  Average Correlation with
                           Educa-   Intelli-   LGD     Supervisory           FC          Graphic
                           tion     gence      Score   Aptitude      Youth   Appraisal   Appraisal   Cluster I   Cluster II
Cluster I
  Company Rank             .98*     Inc.       .88*    .42*          [?]     .34*         .07        [?]          .20
  Education                         .57*       .57*    [?]           .32*    .03         -.22†       [?]         -.10
  Intelligence                                 .45*    .56*          .43*    .09         -.18†       [?]         -.04
  LGD Score                                            .30*          .19    -.01         -.12        .48         -.06
  Supervisory Aptitude                                               .29*   -.01         -.08        [?]          [?]
  Youth                                                                      .20†        -.06        .33          .07
Cluster II
  Forced-Choice Appraisal                                                                 .68*       .11          .68
  Graphic Appraisal                                                                                 -.10          [?]

Mean                       12.0     20.4       4.5     22.5          42.9    22.6        22.8
Standard Deviation          3.3      4.4       2.0      3.4           8.4     3.2         2.1

* P < .01.
† P < .05.
Inc.: Not enough data on intelligence available in upper echelons to compute correlation.
Italicized coefficients are based on biserial r; the remainder are Pearson product-moment correlations.
When rank is partialed out, a large percentage of variance in the amount of successful leader activity is accounted for by youth (partial r = -.70). At the same time, men rated as inadequate by their superiors on the forced-choice performance reports (partial r = -.69) and the graphic ratings (partial r = -.38) tend to attain high LGD scores. Correlations among the other variables remain unaffected by partialing out rank or else are reduced to negligible importance.
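The partial correlations can be checked with the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is a minimal illustration rather than the authors' own worksheet: the function name is hypothetical, the .88 and .34 are the correlations with rank quoted in the Results, and the -.01 between LGD score and forced-choice appraisal is an assumed reading of the scanned Table 1.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Zero-order correlations (assumed readings, see the note above).
r_lgd_fc = -0.01    # LGD score with forced-choice appraisal
r_lgd_rank = 0.88   # LGD score with company rank (biserial)
r_fc_rank = 0.34    # forced-choice appraisal with company rank (biserial)

r_partial = partial_r(r_lgd_fc, r_lgd_rank, r_fc_rank)
print(f"LGD score vs. forced-choice appraisal, rank held constant: {r_partial:.2f}")
```

The result, about -.69, agrees with the partial correlation reported for the forced-choice performance reports. A near-zero raw correlation turns strongly negative because both variables correlate positively with rank.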


There was some doubt about the meaning of the partial correlations. First, in order [...], it was necessary to assume that the biserial correlations between rank and the other variables were estimates of the product-moment correlations between rank and each variable. Second, partial r gives the amount of the relationship between a pair of measures ruling out the effects of a third when the remaining variances of the pair of measures are equal. But, if [...] rank is held constant, it will impose different restrictions in range on each member of a pair of variables so that partial r provides a description which usually never exists in reality. Thus, if the variances of forced-choice appraisal and LGD scores could be equalized after company rank was held constant, then a correlation of -.69 would exist between them. But it is seldom that this equalization of variance occurs in nature.

A second approach, purposive sampling, was used to rule out the effects of company rank on the correlations between LGD scores and the other variables. Table 3 shows these correlations for first level supervisors only and for upper level supervisors only. From the results in Table 3, it was inferred that when rank is held constant, experimentally, the correlations between LGD performance and the other variables tend to reduce to insignificance. This was not unexpected since LGD score and rank correlated .88. As is shown in Table 3, first-line supervisors with high LGD scores tended more than upper echelon supervisors to be considered inadequate on-the-job, although the differences were not significant. The less extreme results obtained through this sampling procedure as compared with the partial r approach were attributed to the fact that no attempt was made to equalize the variance of each two variables correlated in the upper and lower supervisory ranks while partial r forced such equalization.


Table 2

Partial Correlations [...] Youth and Superiors' Appraisals, [...]

                           Supervisory             Forced-Choice   Graphic
                           Aptitude      Youth     Appraisal       Appraisal
LGD Score                  [?]           -.70      -.69            -.38
Supervisory Aptitude                     [?]       [?]             [?]
Youth                                              [?]             [?]
Forced-Choice Appraisal                                            [?]


Table 3

Correlations Between LGD Score [...] Youth, and Superiors' Appraisals, with Company Rank Held Constant by Purposive Sampling

                                            Company Rank
                           First Level             Second, Third or Fourth Level
                                  r with                    r with
Variable                   N      LGD Score        N        LGD Score
Supervisory Aptitude       [?]    [?]              [?]      -.21
Youth                      [?]    .09              25       .02
Forced-Choice Appraisal    [?]    -.12             23       -.04
Graphic Appraisal          [?]    -.20             22       -.12


Conclusions


The results of this study are a strong confirmation of a number of common-sense observations as well as research findings about the influence of a person's rank, prestige or status in an organization and his tendency to play the role of leader in small groups of members from that organization even where there is no appointed leader for the immediate situation.

The biserial correlation of .88 between a participant's company rank and his leader behavior in a supposedly initially leaderless discussion is consistent with a number of other studies. For example, Bass and Coates (1) found that there was a significantly greater increase in LGD scores on a retest a year after the original test by ROTC cadets who had been promoted to positions of cadet first lieutenant or higher during the period which intervened between test and retest than the remainder (who had become cadet second lieutenants). Similarly, Michigan Conference Research studies suggest that


when three-man appraisal boards meet, the 
conclusions reached are those in agreement 
with the member of highest status. Also ex- 
ecutives appear to call so-called planning con- 
ferences of their subordinates mainly to ob- 
tain subordinates’ agreement on what the 
executive has already decided to do (5). 

In previous studies of unacquainted candi- 
dates or candidates of similar initial rank 
there have uniformly been reported, by at 
least 11 separate investigations, correlations 
ranging from .30 to .70 between LGD scores 
and various criteria of supervisory success 
or leadership potential. In the present study, 
these correlations were close to zero sug- 
gesting strongly that Mandell’s suspicions are 
confirmed concerning the general lack of 
validity of the LGD among acquaintances, 
especially where they differ greatly in initial 
prestige or rank. 

The theoretically significant negative cor- 
relations between LGD scores and criteria of 
supervisory success when company rank is 
partialed out statistically, pose more ques- 
tions than they answer. These include: 

1. Is one of the requirements necessary to 
be a successful first-line supervisor, the ability 
to play a subordinate role when in a social 
situation with those of higher company rank 
than he, even though they are not his im- 
mediate superiors and the situation is out- 
side plant jurisdiction? Or, on the other 
hand, are organizations discouraging com- 
munication upward from lower echelon man- 
agement as well as hindering executive de- 
velopment by appraising as inadequate those 
first-line supervisors who give suggestions, 
opinions and information, who take initiative 


and show originality in their interactions with 
their superiors? 


2. Is it the young [...] who is most conscious [...] to ignore it, even [...]


3. Since active trainees, trainees who re- 
ceive and take advantage of the opportunity 
to make decisions, usually learn more than 
passive ones, to what extent are first-line su- 
pervisors handicapped when placed in con- 
ference training with upper-echelon person- 
nel? 


Summary 


LGD scores of 131 oil refinery supervisors were correlated with their rank in the refinery, their education, intelligence, “supervisory aptitude” test scores, and supervisors' appraisals of their on-the-job performance. LGD scores correlated .88 with rank, .57 with education, .45 with intelligence, .30 with supervisory aptitude and -.19 with age. Most of these correlations could be attributed to the influence of rank on all these variables. When rank was partialed out statistically, LGD scores were highly positively related to age and highly negatively related to superiors' appraisals. It was concluded that in general the LGD is not valid where participants are of known different rank. The complexity of the outcomes of this study raises some interesting questions about the validity of superiors' appraisals as [...] criteria of supervisory performance and the influence of formal rank on the behavior of conference participants of differing rank.


Received May 19, 1952, 


References


1. Bass, B. M., and Coates, C. H. Studies of leadership in ROTC. In preparation.

2. Bass, B. M., McGhee, C. R., et al. Personality variables related to leaderless group discussion behavior. J. abnorm. soc. Psychol., 1953, 48, 120-128.

3. Canfield, A. A. The “Sten” Scale: A modified C Scale. Educ. psychol. Measmt., 1951, 11, 295-297.

4. Mandell, M. The group oral performance test. Washington: U. S. Civil Service Commission, 1952.

5. Conference Research, University of Michigan. Process of the Administrative Conference. Contract N6 onr-232, T.O.VII, March, [...].




Flesch Readability Analysis of the Major Pre-election 
Speeches of Eisenhower and Stevenson 


Arthur I. Siegel 


Institute for Research in Human Relations, Philadelphia 


and 


Estelle Siegel 


Drexel Hill, Pennsylvania 


One non-political point of interest during the [...] election campaign was the level at which each of the rival presidential candidates was speaking. For instance, some people maintained that Stevenson was doing himself an injustice because he was speaking over the heads of his audience, e.g., he was being scholar-like, pedantic, academic, formal, [...], etc. On the other hand, some chided the candidate for being a joker or a punster. In order to gain some insight into the legitimacy of these arguments, a Flesch readability analysis of the texts of six of the major talks of Stevenson was performed. For comparative (control?) purposes, the texts of the major talks given by Eisenhower on the same dates were also analyzed.


Method 


The texts of the major talks of each of the rival candidates appeared in the Philadelphia Inquirer on the morning following the talks. In some cases the newspaper saw fit to delete certain parts of the speeches of each candidate. In these cases, only the “selected text” was available for analysis. On October 28, Eisenhower's major appearance was a television show in which Eisenhower answered [...] Republican Com- [...]. The text of the Eisenhower replies to these questions was [...].

Originally, it was our intent to [...] texts of the six talks of each [...] given just prior to the election. [...] Sunday, November 2, issue of the [...] did not contain the previous [...] either candidate. Neither candidate [...] Sunday, November 2. Thus, the texts of the major talks of Eisenhower and Stevenson on October 27, 28, 29, 30, 31, and November 3 were included in the present study.

Flesch 1 recommends, as a sampling procedure, that every third paragraph be taken and that the first 100 words of each sampled paragraph be analyzed. However, since many of the paragraphs of each candidate ran under 100 words, the entire texts were analyzed.


Results 

The results of the analysis are presented in 
Table 1. 

The “reading ease” of the texts of three of 
the major talks given by Stevenson during the 
final eight days of the campaign were classi- 
fied as “Standard” by the Flesch analysis, 
and three were classified as “Fairly difficult.” 
For the same period, the reading ease of the 
texts of four of Eisenhower’s talks were classi- 
fied as «Standard,” while one was classified 
as “Difficult,” and one was classified as 
«Fairly difñcult.” The mean reading ease 
score of Eisenhower’s speeches was “Fairly 
difficult” and of Stevenson’s was “Standard.” 
m the actual difference of only 1.5 points 
is neglible. A “Standard” style of reading 
18 is found, according to Flesch, in Digests; 
ora irly difficult” style is characteristic of 


f 
texts of Pun ad two to be “highly inter- 
inte » The styles of five of Stevenson’s 
sting: were “interesting” and one was 
Speer interesting: An “interesting” style, 
R new readability yardstick. J. appl. 
1 Flesch, 115. 32, 221-233. 


psychol., 1 


106 


A. I. Siegel and E. Siegel 


Table 1

Flesch Reading Ease and Human Interest Scores and Descriptions for
Texts of Six Pre-election Talks by Eisenhower and Stevenson

Reading Ease

              Eisenhower                   Stevenson
Date       Score  Description          Score  Description
Oct. 27    53     Fairly difficult     61     Standard
Oct. 28    60     Standard             57     Fairly difficult
Oct. 29    66     Standard             59     Fairly difficult
Oct. 30    63     Standard             66     Standard
Oct. 31    46     Difficult            55     Fairly difficult
Nov. 3     65     Standard             64     Standard
Mean       58.8   Fairly difficult     60.3   Standard
S.D.       4.1                         3.8

Human Interest

              Eisenhower                   Stevenson
Date       Score  Description          Score  Description
Oct. 27    28     Interesting          41     Interesting
Oct. 28    39     Interesting          28     Interesting
Oct. 29    51     Highly interesting   42     Highly interesting
Oct. 30    43     Highly interesting   30     Interesting
Oct. 31    33     Interesting          39     Interesting
Nov. 3     27     Interesting          34     Interesting
Mean       36.8   Interesting          34.0   Interesting
S.D.       8.5                         5.6

according to Flesch, is found in the Digests, while a “highly
interesting” style is found in the New Yorker.

Thus, using the Flesch analysis as a yardstick, for the period
investigated, we have little evidence to indicate that Stevenson
approached the academic level, nor were his speeches more difficult
to understand than Eisenhower’s. On the other hand, by the Flesch
analysis, there was a slight tendency for Eisenhower’s speeches to
be more “interesting.”
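
The reading ease descriptions in Table 1 follow fixed score bands. A minimal sketch of that lookup is given below. The band boundaries are the ones commonly cited for the Flesch reading ease scale and are stated here as an assumption, as is the rule that a score on a boundary falls in the higher band; the function and variable names are illustrative only.

```python
from statistics import mean

# Lower bound of each band, highest first (assumed boundaries).
RE_BANDS = [
    (90, "Very easy"),
    (80, "Easy"),
    (70, "Fairly easy"),
    (60, "Standard"),
    (50, "Fairly difficult"),
    (30, "Difficult"),
    (0, "Very difficult"),
]


def describe_reading_ease(score):
    """Map a reading ease score to its descriptive label."""
    for lower_bound, label in RE_BANDS:
        if score >= lower_bound:
            return label
    return "Very difficult"  # scores below zero


# Reading ease scores as printed in Table 1.
EISENHOWER = [53, 60, 66, 63, 46, 65]
STEVENSON = [61, 57, 59, 66, 55, 64]

for name, scores in (("Eisenhower", EISENHOWER), ("Stevenson", STEVENSON)):
    average = mean(scores)
    print(name, round(average, 1), describe_reading_ease(average))
```

Because the two means sit on either side of the boundary at 60, they receive different labels even though they differ by only about 1.5 points, which is the point made in the text.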


Received January 29, 1953.
Early publication.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953


Factorial Analysis of the Original and the Simplified Flesch
Reading Ease Formulas


Marvin D. Dunnette
Industrial Relations Center,
University of Minnesota

and

Paul W. Maloney *
The Addison Lewis Co.,
Minneapolis


Farr, Jenkins, Paterson, and England (5) presented evidence showing
that the simplified reading ease formula by Farr, Jenkins, and
Paterson (4) yields scores quite in agreement with those obtained
with the original Flesch formula (6). The Flesch formula was
simplified in order to provide a method which “would obviously be
much faster and would require no knowledge of syllabification on
the part of the analyst” (4, p. 333). The presentation of the
simplified formula, however, was not met with universal acceptance.
Klare (8) and Flesch (7) raised two principal objections. First,
they doubted the claimed time economy of the new method. This
argument was met with a study by the Minnesota group (5) in which a
number of graduate students determined reading ease scores by both
methods. The new method was found to be [faster].

The second objection to the new method is more formidable, and has
as yet been unanswered. The argument states that counting one
syllable words is less accurate than counting syllables. The
logical basis for this position is sound: syllable counting
involves attentive study of each word, whereas a person picking out
one syllable words might just scan the passage. The present study
was designed to bear on this question: a comparison was made of the
accuracy of the two counting methods. Other aspects of reading ease
calculation were also investigated.

Method

Untrained 3 subjects for the experiment included 72 male and 72
female freshman students. All were enrolled in freshman English in
the School of Agriculture and Home Economics at the University of
Minnesota.4 These students had never been trained in and probably
had never heard of the techniques employed in making readability
analyses.

Since the number of syllables is inversely related to the number of
one syllable words, it became necessary to control the difficulty
of the test material.5 Further, it was felt that ability to perform
readability counts may be related to a person’s reading ability.
Because of this, the subjects were grouped into four relatively
homogeneous groups on the basis of their paragraph comprehension
scores on the Nelson-Denny Reading Test. The time taken to perform
the counts was also measured. Subjects within each group were then
randomly assigned to conditions imposed by the following factorial
design:

                               Reading Ability Level Group
Difficulty    Type of
of Material   Count         Lower 25%  25%-50%  50%-75%  Upper 25%
Easy          Syllables
              O S Words
Medium        Syllables
              O S Words
Difficult     Syllables
              O S Words

* [Formerly] research assistant in the Industrial Relations Center.

2 Factorial design experiments and the basic computations involved
are covered by Edwards (3) and Nelson (9).

3 Farr, Jenkins and Paterson (4) have stated that their purpose was
to remove the complexity from the Flesch formula and make it more
useful to practical men in their daily work. Because of this, we
felt it would be most desirable to use naive or untrained subjects
who were not oriented in favor of either method of counting.

4 The authors wish to express thanks to Professor Ralph G. Nichols
and Professor James T. Brown who offered the cooperation of their
department.

5 The test material was carefully chosen so as to



A test form was developed for each of these 
six conditions. The first page of each was a 
simple explanation of the subject’s task. Ex- 
planation was facilitated by the use of two 
short examples which were used for all six 
conditions. The second page included a 50- 
word practice passage of the same difficulty 
as the test passage. The third page was de- 
voted to two test passages of 100 words each. 
The subjects were instructed to perform the 
proper count separately for each passage and 
record their answers in the spaces provided. 

Oral instructions were framed to emphasize accuracy over speed.
Subjects were asked to check their completed work, and then to
record the letter appearing on the blackboard. A new letter was
placed on the blackboard every 10 seconds, thus providing time
scores without emphasizing the speed factor.

Four factorial designs as shown above were used in the experiment
to include information separately for male and female subjects, and
for error and time scores. Since 72 subjects of each sex were
available, plans provided for three replications in each cell.
Several students were absent on the day of the administration.
These were immediately sent a test form by mail. In all, 24
students were absent; 20 were located; 16 returned completed test
forms. These mailed returns, of course, did not include a time
score. It was, therefore, necessary to reduce the number of
replications in the two time-score designs to two per cell. Thus,
error data for five boys and three girls were missing.
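
The cell arithmetic of the design can be checked with a short sketch. It is only an illustration under stated assumptions: the level names, the helper `assign_within_groups`, and the assumption of exactly 18 subjects per reading ability group are not taken from the paper.

```python
import random
from itertools import product

DIFFICULTY = ["easy", "medium", "difficult"]
COUNT_TYPE = ["syllables", "one-syllable words"]
READING_GROUP = ["lower 25%", "25%-50%", "50%-75%", "upper 25%"]

# 3 difficulty levels x 2 types of count x 4 reading groups = 24 cells.
cells = list(product(DIFFICULTY, COUNT_TYPE, READING_GROUP))
subjects_per_sex = 72
replications = subjects_per_sex // len(cells)  # 72 / 24 = 3 per cell


def assign_within_groups(subjects_by_group, per_cell=3, seed=0):
    """Randomly assign the subjects of each reading group to the six
    difficulty-by-count conditions, `per_cell` subjects per condition.

    Assumes each group holds exactly 6 * per_cell subjects.
    """
    rng = random.Random(seed)
    conditions = list(product(DIFFICULTY, COUNT_TYPE))
    assignment = {}
    for group, subjects in subjects_by_group.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        slots = conditions * per_cell
        for subject, (difficulty, count) in zip(shuffled, slots):
            assignment[subject] = (group, difficulty, count)
    return assignment
```

With 72 subjects of one sex split into four groups of 18, every one of the 24 cells receives three subjects, matching the three replications planned for the error-score designs.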


cover the entire range of difficulty. “Easy” passages registered
above 90; “medium” in the 40’s; “difficult” passages selected
registered below 10. (RE scores on both formulas were nearly
[identical].)

Homogeneity of variance was tested by means of Bartlett’s Test (3).
In no case was Chi-square sufficiently large to reject the
hypothesis of equal variances. Plotting the data indicated the
distributions to be [symmetrical] and platykurtic. Since skewness
is the most serious deviation from normality (for purposes of
variance analysis), and since Cochran (1) has shown that
non-normality does not seriously alter conclusions derived from
variance analysis, no test was made of the assumption of normally
distributed parent population.
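
For readers unfamiliar with Bartlett's test, the statistic can be sketched as follows. This uses the standard textbook form of the statistic, not the authors' worksheet; the function name is illustrative and any data passed to it here would be hypothetical.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Return (statistic, degrees of freedom) for Bartlett's test.

    The statistic is referred to a chi-square distribution with
    k - 1 degrees of freedom. Every group needs at least two values
    and a non-zero variance, otherwise the logarithm is undefined.
    """
    k = len(groups)
    sizes = [len(group) for group in groups]
    variances = [variance(group) for group in groups]  # n - 1 in denominator
    total = sum(sizes)

    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (total - k)
    ) / (3 * (k - 1))
    return numerator / correction, k - 1
```

A statistic of zero means the sample variances are identical; the hypothesis of equal variances is rejected only when the statistic exceeds the chi-square critical value for the chosen level.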


Results 


Table 1 shows the results of the analysis of variance. It is seen
that some of the sources of variation are significantly different
from the variation due to error. These results can be interpreted
more easily, however, by referring to Table 2 which gives the means
and standard deviations for accuracy and time required.

Both boys and girls performed the one syllable word count more
accurately than the syllable count. For the boys, the difference
was significant at the 5 per cent level. For the girls, however,
the difference was not statistically significant. Both boys and
girls did the one syllable word count in 25 per cent less time than
that required for the syllable count. The differences were
statistically significant.

These findings suggest that the new formula is superior with
respect to both the accuracy and time required to perform the
counts. Since the subjects were unfamiliar with the counting
methods required by readability formulas, we can state with
assurance that the F, J, and P simplified formula is inherently
easier to perform. This finding, combined with previous evidence
(5), shows that persons will perform the new count more rapidly and
with greater accuracy regardless of the degree of their skill.


Data in Table 1 show a significant interaction for boys between
type of count and difficulty of material. Table 3 shows this
interaction effect more clearly. The one syllable word count was
less accurately performed for easy material but was more accurately
performed for difficult material. Expressing the error as a
percentage may cover up the practical significance of these
differences. This is


Table 1

Results of the analysis of variance

[Table not recoverable from the source. Columns: d/f, MS, and F for
error scores and for time scores, separately for boys and girls.
Rows: sources of variation (difficulty of material, reading ability
level, type of count, their interactions, and error).]

* Significant at the 5 per cent level.
** Significant at the 1 per cent level.


Table 2

Means and standard deviations for accuracy and time required

[Table not recoverable from the source.]


Table 3

Mean Per Cent Error Made by Boys as Related to Difficulty Level
and Type of Count 1

                     Mean Per Cent Error
Difficulty                         One Syllable
of Material        Syllables       Words
Easy               -0.55           -1.67
Medium             -0.58            0.90
Difficult          -3.38            0.76

1 Data are not included for girls since for them the effect was not
statistically significant.


because there are more total syllables than 
one syllable words in any given reading pas- 
sage. Therefore, because RE scores depend 
on the absolute number of units counted, a 
given variation in reading ease score reflects 
a larger per cent error in the one syllable 
word count than in the syllable count. This 
means that a greater per cent error is toler- 


ated by the simplified formula than by the 
original formula. 
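
The argument in the preceding paragraph can be made concrete with a small calculation. The weights below are the published coefficients of the two formulas (0.846 per syllable in the Flesch formula and 1.599 per one syllable word in the Farr, Jenkins, and Paterson formula); the typical counts of 160 syllables and 62 one syllable words per 100 words are assumed round figures chosen for illustration, and the function name is hypothetical.

```python
FLESCH_SYLLABLE_WEIGHT = 0.846    # RE points per syllable per 100 words
FJP_ONE_SYLLABLE_WEIGHT = 1.599   # RE points per one syllable word per 100 words


def per_cent_error_for_one_point(weight, typical_count):
    """Per cent counting error that shifts the RE score by one point."""
    units = 1.0 / weight  # miscounted units needed to move RE by one point
    return 100.0 * units / typical_count


# Assumed typical counts per 100-word sample (illustrative only).
syllable_error = per_cent_error_for_one_point(FLESCH_SYLLABLE_WEIGHT, 160)
one_syllable_error = per_cent_error_for_one_point(FJP_ONE_SYLLABLE_WEIGHT, 62)

print(f"syllable count:          {syllable_error:.2f} per cent per RE point")
print(f"one syllable word count: {one_syllable_error:.2f} per cent per RE point")
```

Under these assumed counts a one-point change in reading ease corresponds to a larger per cent error in the one syllable word count than in the syllable count, which is the sense in which the simplified formula tolerates a greater per cent error.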


Discussion 


The findings do not relate to the degree of agreement between
scores derived via the two formulas. They relate instead to whether
the new formula is operationally a simplified version of the old or
is simplified in name only. The results suggest that the revised
formula is superior with respect both to time taken and to the
accuracy with which it is applied. The use, in this study, of
untrained subjects shows that this superiority does not depend on
training or previous experience but resides instead in the
different method of counting required by the new formula. This
formula, therefore, appears to be a [...] simplified version [...].


Summary

A factorial experiment was used to study the effects of four
factors on the ability to perform readability counts. The factors
investigated were: (1) difficulty of material; (2) the type of
count performed; (3) reading ability of persons performing the
counts; and (4) sex.

The major finding was that the counting of one syllable words could
be done in about three-fourths the time required for counting
syllables. Boys performed the former count more accurately than the
syllable count. The difference was not statistically significant
among the girls.

A significant interaction effect was found between difficulty level
and type of count. The syllable count was performed more accurately
for easy material; the one syllable word count was performed more
accurately for difficult material. Neither accuracy nor time taken
was significantly associated with reading ability or sex.

It has been concluded that the new F, J, and P formula is truly
simplified since it can be applied with a greater degree of
accuracy and requires less counting time.


Received February 24, 1953. 
Early publication. 


References

1. Cochran, W. G. Some consequences when the assumptions for the
   analysis of variance are not satisfied. Biometrics, 1947, 3,
   22-38.

2. Cochran, W. G., and Cox, G. M. Experimental designs. New York:
   Wiley, 1950.

3. Edwards, A. L. Experimental design in psychological research.
   New York: Rinehart, 1950.

4. Farr, J. N., Jenkins, J. J., and Paterson, D. G. Simplification
   of Flesch reading ease formula. J. appl. Psychol., 1951, 35,
   333-337.

5. Farr, J. N., Jenkins, J. J., Paterson, D. G., and England, G. W.
   Reply to Klare and Flesch re “Simplification of Flesch reading
   ease formula.” J. appl. Psychol., 1952, 36, 55-57.

6. Flesch, R. A new readability yardstick. J. appl. Psychol., 1948,
   32, 221-233.

7. Flesch, R. Reply to “Simplification of Flesch reading ease
   formula.” J. appl. Psychol., 1952, 36, 54-55.

8. Klare, G. R. A note on “Simplification of Flesch reading ease
   formula.” J. appl. Psychol., 1952, 36, 53.

9. Nelson, C. W. Use of factorial design in industrial relations
   research. Research and Technical Report 6, University of
   Minnesota, Industrial Relations Center. Dubuque, Iowa: Wm. C.
   Brown Company, 1950. Pp. 52.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953


Reliability of the Original and the Simplified Flesch Reading 
Ease Formulas 


George W. England, Margaret Thomas, and Donald G. Paterson


University of Minnesota * 


Both Klare (7) and Flesch (5) in their attacks on the Farr,
Jenkins, and Paterson (2) simplification of the Flesch reading ease
formula (4) suggested that the reliability of the F, J, and P
simplification formula would be lowered. Klare asserted that the
simpler method would magnify each counting error and thus decrease
reliability. Flesch attacked the idea that r between the original
and the simplified reading ease scores would be higher for more
heterogeneous materials 1 than for the employee handbooks used in
developing the F, J, and P simplified formula and, in effect,
implied that the reliability of the Flesch formula is impaired by
the F, J, and P simplified formula.

Farr, Jenkins, Paterson, and England (3) were able to answer the
critics Klare and Flesch with respect to the relative knowledge of
syllabification required for [...] and also with respect to the
speed with which the new, simplified formula can be applied (a mean
time of 82 seconds versus a mean of 65 seconds per 100-word sample
was found). Discussion of the problem of reliability, however, was
necessarily postponed with the following statement, “A
thorough-going study of the reliability of both methods would be
needed to settle this issue” (3, p. 56).

The present paper reports on this aspect of the problem.

* [...] member of the staff of the Industrial Relations Center at
the time the work was done. [...] is now with Personnel Research
[...], Camden, N. J. [...]

1 The idea that r would be lowered if more heterogeneous materials
were used is naive, statistically speaking.

Procedure 


Data from House Organs. During the spring quarter of 1952, 13 pairs
of analysts computed reading ease scores by both formulas for each
of 196 hundred-word samples drawn from 49 house publications.2 Most
of these analysts had participated during the winter quarter of
1952 in the prior study of the time required to compute reading
ease scores by the Flesch method and by the F, J, and P simplified
method. Stress was now placed on accuracy of counting syllables,
one-syllable words, and sentence length as well as in the use of
the Farr and Jenkins table (1) and the Farr, Jenkins, and Paterson
table (2). One member of each pair used the old formula and the new
formula in analyzing a given hundred-word sample and the other
member of each pair did likewise for the same hundred-word sample.
The 14 analysts formed 13 pairs and each pair, on the average,
analyzed about 15 hundred-word samples. It is recognized that this
procedure would produce lower reliability coefficients than would
have been the case if one pair of experienced analysts had analyzed
all 196 samples. It was anticipated, however, that the emphasis on
accuracy of all the operations would tend to produce acceptable
reliability data.

2 Graduate students in Mr. Paterson’s Seminar in Applied Psychology
participated in the study. The work was done under the immediate
supervision of George W. England who also assumed responsibility
for the preparation of the statistical constants and reliability
coefficients. The writers are grateful to the following students:
Robert C. Becker, Sarah Ruth Cook, Ellen A. Corcoran, George W.
England, [...] Epe, Richard S. Hatch, Sulo N. Havu[...], Benjamin
Lasoff, Raymond C. Lee, Jr., Paul W. Maloney, Ernest L. McCollum,
Arthur C. McKinney, Charles Newstrom, and Margaret Thomas.

3 Margaret Thomas conducted this phase of the study.


Table 1

Means, Standard Deviations and Reliability Coefficients for Analyst
to Analyst Study of the Flesch and the Farr, Jenkins, and Paterson
Simplified Reading Ease Formulas Applied to House Organs

Note: N = 196 hundred-word samples drawn from 49 House Organs with
counts and computations made by 13 pairs of analysts.

                                                       r
                             Analyst 1     Analyst 2   (Analyst 1 vs.
                             Mean   S.D.   Mean   S.D.  Analyst 2)
Sentence Length              20.3   7.0    20.3   7.[?]   .90
Syllable Length              159.6  15.3   158.7  [?]     [?]
No. of One-Syllable Words    62.4   7.6    62.4   [?]     .95
Flesch R. E. Score           51.2   16.1   51.7   [?]     [?]
F, J, and P R. E. Score      48.0   14.1   47.8   14.[?]  .93


Data from Books. One analyst 3 undertook to compute reading ease
scores by both formulas for each of 196 hundred-word samples drawn
from 28 books. Then, at a later date, this same analyst recomputed
the data for 77 of the 196 samples drawn from 11 books. In this
way, a basis was provided for computing test-retest reliability
coefficients for the 77 samples.
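
A reliability coefficient of the kind described here is a product-moment correlation between two sets of scores for the same samples (analyst 1 versus analyst 2, or test versus retest). The sketch below shows the computation; the function name is illustrative and the scores in the usage lines are hypothetical, not data from the study.

```python
import math


def pearson_r(first, second):
    """Product-moment correlation between two equal-length score lists."""
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical reading ease scores for five samples, scored twice.
test_scores = [62, 48, 75, 30, 91]
retest_scores = [60, 50, 74, 33, 90]
print(round(pearson_r(test_scores, retest_scores), 3))
```

The coefficient rises toward unity both when the two scorings agree closely and when the samples themselves are spread over a wide range of scores, a point taken up in the Results below.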


Results 


Data from House Organs. The statistical constants and the
reliability coefficients 4 are presented in Table 1. It will be
noted that the means and sigmas obtained by each of a pair of
analysts are quite close. The reliability coefficients shown in
column 4 of Table 1 are all .90 or higher. As was true in the
Hayes, Jenkins, and Walker study (6), the reliability of computing
average sentence lengths per hundred-word sample is lower than for
making the syllable counts. Furthermore, the evidence shows that
total syllable counts and counting the number of one syllable words
per hundred-word samples are made with a gratifyingly high degree
of reliability ([...] and .95 respectively). The reliability
coefficients [for the two reading ease scores] ([...] and .93
respectively) compare favorably with those reported by Hayes,
Jenkins, and Walker (6) for the original Flesch formula. [...] no
real loss in reliability has arisen by the introduction and use of
the F, J, and P simplified formula.

Data from Books. The statistical constants and the reliability
coefficients for the single analyst are presented in Table 2.
Again, the means and sigmas of the first count or computation
(test) and of the second count or computation (retest) by this
analyst are quite close. Of more importance, however, is the fact
that these hundred-word samples drawn from 11 books represent far
more heterogeneous materials than was true of the samples drawn
from the house organs. The sigmas in Table 2 when compared with the
sigmas in Table 1 clearly prove this point. The range of the
original Flesch reading ease scores for these 11 books was from
[...] for “Fun with Dick and Jane” to 26 for “Personality, a
Psychological Interpretation.” As a matter of fact, the 11 books
were [selected] from all the difficulty levels. And, as would be
expected, the test-retest reliability coefficients are much higher.
In fact, they approach unity. This is due to the combined operation
of the greater heterogeneity of the materials sampled and having
only a single “compulsive” analyst make all counts and
computations. The results closely approximate the high analyst to
analyst reliability coefficients reported by Hayes, Jenkins, and
Walker (6).

4 [...] in Table 1 may be [...] “alternate form reliability
coefficients” whereas [...] they are “analyst to analyst”
reliability coefficients.


Table 2

Means, Standard Deviations and Test-Retest Reliability Coefficients
for a Single Analyst Study of the Flesch and the Farr, Jenkins, and
Paterson Simplified Reading Ease Formulas Applied to Books

Note: N = 77 hundred-word samples drawn from 11 books with all
counts and computations made by a single analyst.

                             First Count or   Second Count or
                             Computation      Computation      Test-
                             (Test)           (Retest)         Retest
                             Mean    S.D.     Mean    S.D.     r
Sentence Length              19.8    10.6     20.0    10.7     .95
Syllable Length              146.6   19.7     146.7   19.6     .99
No. of One-Syllable Words    69.6    8.5      69.4    8.7      .99
Flesch R. E. Score           62.4    24.7     61.7    24.4     .99
F, J, and P R. E. Score      59.0    21.6     58.7    21.7     .97


Intercorrelations between Original and 
Simplified Formulas 


Two intercorrelations between the original 
and simplified RE scores for 196 samples from 
49 house publications were computed: (a) 
for analyst 1, r was +.84; and (b) for 
analyst 2, r was + .87. The intercorrelation 
between the original and simplified RE scores 
for 196 samples from 28 books for a single 
analyst was +.94 and for the 77 samples 
from 11 books for the same single analyst, r 
was .97. The intercorrelation for the original 
and simplified RE scores for the averages of 
the 28 books (7 one-hundred word samples 
each) was + .97. Thus, the original and the 
simplified RE scores are comparable when 
computed by a single, fairly experienced and 
compulsive analyst. 
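
To see how the two scores are related, both formulas can be applied to the same counts. The sketch below uses the published coefficients of the original Flesch reading ease formula and of the Farr, Jenkins, and Paterson simplification; the function names are illustrative, and the inputs are the Analyst 1 means printed in Table 1 for the house organs.

```python
def flesch_reading_ease(syllables_per_100_words, sentence_length):
    """Original Flesch formula: syllables per 100 words and average
    sentence length in words."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * sentence_length


def fjp_reading_ease(one_syllable_words_per_100, sentence_length):
    """Farr, Jenkins, and Paterson simplification: one syllable words
    per 100 words and average sentence length in words."""
    return 1.599 * one_syllable_words_per_100 - 1.015 * sentence_length - 31.517


# Analyst 1 means from Table 1 (house organs).
sentence_length = 20.3
syllables = 159.6
one_syllable_words = 62.4

print(round(flesch_reading_ease(syllables, sentence_length), 1))
print(round(fjp_reading_ease(one_syllable_words, sentence_length), 1))
```

Applied to the mean counts, the original formula gives about 51.2, in line with the printed mean Flesch score, and the simplified formula gives a value a few tenths of a point below the printed mean of 48.0. A small gap of this kind may reflect rounding and the use of lookup tables rather than direct computation, although the paper does not say so.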


Summary 


The reliability of the original and the simplified Flesch reading
ease formula based on (a) samples drawn from house organs, using 13
pairs of relatively inexperienced analysts; and (b) samples drawn
from books, using a single, more experienced analyst is reported.
The findings confirm the earlier reliability study by Hayes,
Jenkins, and Walker (6) and show that both the original and the
simplified Flesch reading ease formulas are highly reliable. With
heterogeneous materials and a single “compulsive” analyst,
test-retest reliability coefficients from +.95 to +.99 were
obtained. Intercorrelations between the original and simplified
formulas are likewise “high.”


Received February 10, 1953. 
Early publication. 


References 


1. Farr, J. N., and Jenkins, J. J. Tables for use with the Flesch
   readability formulas. J. appl. Psychol., 1949, 33, 275-278.

2. Farr, J. N., Jenkins, J. J., and Paterson, D. G. Simplification
   of Flesch reading ease formula. J. appl. Psychol., 1951, 35,
   333-337.

3. Farr, J. N., Jenkins, J. J., Paterson, D. G., and England, G. W.
   Reply to Klare and Flesch re “Simplification of Flesch reading
   ease formula.” J. appl. Psychol., 1952, 36, 55-57.

4. Flesch, R. A new readability yardstick. J. appl. Psychol., 1948,
   32, 221-233.

5. Flesch, R. Reply to “Simplification of Flesch reading ease
   formula.” J. appl. Psychol., 1952, 36, 54-55.

6. Hayes, Patricia M., Jenkins, J. J., and Walker, B. J.
   Reliability of the Flesch readability formulas. J. appl.
   Psychol., 1950, 34, 22-26.

7. Klare, G. R. A note on “Simplification of Flesch reading ease
   formula.” J. appl. Psychol., 1952, 36, 53.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953


Validity of Readability Formulas * 


Charles E. Swanson
Institute of Communications Research,
University of Illinois

and

Harland G. Fox
Industrial Relations Center,
University of Minnesota


* Grateful acknowledgment is made to the Graduate School,
University of Minnesota, for the research grant to finance
preliminary analysis, field work, and part of analysis of results.
Intensive analysis of [...] were supported under [...] R-246, T.O.
4, Office of Naval Research, [...] as responsible investigator. Aid
was provided by staffs of the Industrial Relations Center and
Research Division, School of Journalism, University of Minnesota,
and by Drs. James J. Jenkins and Robert L. Jones. The senior author
is indebted to Dr. George R. Klare, University of Illinois, for his
critique of the analysis.


Whether readability formulas can be used to predict more or less
success in all printed communication is not known. Some authors
suggest their formulas will discriminate among articles expected to
get more or less readership, understanding, etc. Dale and Chall (1)
use the title, A formula for predicting readability. Flesch (2)
says more readable writing will “appeal to readers” and cites an
experiment by Swanson (9) where readership was the criterion.
However, the excellent bibliographies of Hotchkiss and Paterson
(5), Flesch (2) and Klare (6) show few validation studies using
comprehension and retention as criteria.

In their pioneering study Gray and Leary (4) found 24 factors of
style related to reading comprehension of adults. Gray and Leary,
Dale and Chall, and Flesch reduced these to a few factors. Their
findings agreed on word difficulty and sentence length. Flesch also
used personal references in one of his formulas.

In two experiments with articles in a midwestern farm paper Ludwig
(7) varied one factor at a time, word difficulty and personal
references. His test articles were each read by more than 40 per
cent of the two samples of farmers. Readership differences between
the experimental pairs of articles were small and were not
significant.

Analysis of Ludwig’s findings suggested several hypotheses:

Readability factors would have maximum effect when two or more
positively related factors were varied. Easier words and shorter
sentences, for example, should result in increases of
comprehension, other things being equal.

Where more than 40 per cent of an audience selects and reads an
article, less gains in effect can be expected from improving
readability. Also, where lesser proportions of an audience read an
article, the more gains may come from increases in readability.

Motivational factors inherent in content, such as subject matter,
probably are more important, generally, than readability when
individuals select what they want to read and learn from printed
media. For example, comic strips are easy to read but vary widely
in readership, or audience [...]. One comic strip may reach 70 per
cent [while] another strip in the same day’s newspaper may reach 20
per cent of the same audience.

Readability factors might be more important than motivational
factors where individuals are required to read and study and are
tested on their learning. This would be the case in classroom and
training situations.


The Present Experiment 


In this study easier and harder versions of 12 articles were
published in three issues of a paper sent monthly to employees in a
midwestern company. Four articles appeared each month. The 296
employees were randomized into two groups, “easy sample” and
“difficult sample.”

Easy sample received copies of the newspaper with easier versions
of the 12 articles. Difficult sample received the same newspaper
with the harder versions.
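
The randomization step can be sketched as follows. The paper does not describe the mechanics, so the shuffle-and-split approach, the equal group sizes, and the function name are all assumptions made for illustration.

```python
import random


def randomize_two_groups(employee_ids, seed=None):
    """Shuffle the employees and split them into two samples.

    Assumption: the two samples are as nearly equal in size as possible.
    """
    rng = random.Random(seed)
    shuffled = list(employee_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "easy sample": shuffled[:half],
        "difficult sample": shuffled[half:],
    }


# Hypothetical identifiers for the 296 employees.
groups = randomize_two_groups(range(1, 297), seed=1953)
print({name: len(members) for name, members in groups.items()})
```

Random assignment of this kind is what allows later differences between the two samples to be attributed to the article versions rather than to pre-existing differences among employees.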
The 12 articles concerned company products, company history, safety
programs and the working agreement which covered wages, hours, and
working conditions.

Effects of the versions were determined on these criteria: (1)
Retention; measured by a 43-item test of multiple-choice questions
based on the 12 articles; (2) Readership; measured on easier or
harder versions of two articles; and (3) Comprehension; measured by
a 10-item test given before and after exposure to easier or harder
versions of two articles.

Four other instruments were used. They involved general opinions
about company and union, general satisfaction with one’s job (11),
Sanford’s authoritarian-equalitarian scale (8), and Goossen’s
disguised intelligence test (3).

Each subject followed this sequence in an interview: (1) Took the
43-item information test; (2) Read easier or harder versions of two
articles; (3) Reported whether he had read the two articles when
they appeared in the company newspaper. (Sixty per cent had read
the articles. Actually these subjects were reading the articles a
second time in the comprehension test.); (4) Took 10-item
information test on the two articles. (The 10 items were included
in the 43-item test.); and (5) Answered four questionnaires on
general opinions about company, union and job,
authoritarian-equalitarian personality aspects, and intellectual
ability.


Readability Differences 


Three questions concern changes from 
harder to easier versions. What were the 
differences in readability? Could some fac- 
tors decrease comprehension and so decrease 
Positive effects of other factors? Did easier 
and harder versions have the same informa- 
tion content? 

The following readability differences ap- 
peared. 

Formula scores. By the Flesch formula, 
the easier versions had a mean score of 73 
(fairly easy) whereas the harder versions 
scored an average of 59 (fairly difficult). 
The Dale-Chall formula gave similar results. 
The easier versions had a mean Dale-Chall 
Score of 7th-8th grade compared with a score 
Of 11th-12th grade for the harder versions. 

Number of words. The easier versions had fewer words, an average of
284, while the harder versions had an average of 332 words. The
easier versions totaled 3,410 words and the harder versions totaled
3,983 words.




Flesch human interest index. The easier 
versions had a mean score of 46 (very in- 
teresting) and the harder versions a mean 
of 17 (mildly interesting). 

Sentence length. The easier versions had 
an average sentence length of 13.0 words. 
The harder versions had an average sentence 
length of 19.4 words. 

Syllables per 100 words. The easier ver- 
sions had 142 syllables per 100 words and the 
harder versions 161 syllables per 100 words. 

Unfamiliar words. As scored against the Dale-Chall list of 3,000
familiar words, the easier versions had 11.6 per cent unfamiliar
words whereas the harder versions had 20 per cent unfamiliar words.

Verbs and adjectives. The easier versions 
had 130 verbs per 100 adjectives. The harder 
versions had 89 verbs per 100 adjectives. 

This study could not answer the question 
of whether some of these or other readability 
factors cancelled out comprehension gains. 
No previous research had been published at 
the time of designing the study to suggest 
this possibility. The Gray-Leary and Swan- 
son investigations indicated that readability 
factors such as these would combine for posi- 
tive effects. 

In the opinion of three judges the informa- 
tion content of the easier and harder ver- 
sions was the same. They used the multiple- 
choice questions as aids to their judgments. 
No method was known to the investigators 
by which information content could be classi- 
fied and its similarity between easier and 
harder versions defined quantitatively. 

Differences in effects of easier or harder 
versions could not be attributed to subject 
matter. 

Whether differences could be attributed to 
fewer words used in easier versions is a ques- 
tion of whether details were amplified in the 
harder and longer versions. Wilson (10) 
used versions 300, 600 and 1,200 words in 
length. She found that amplification was 
helpful only where the reader had difficulty 
with concepts. Any advantage in this re- 
spect might be in favor of the longer versions. 
However, the investigators believed that the 
information content and amount of amplification 
were held constant. Again, no quantitative 
method was devised to permit other 
investigators to check this point. 


Characteristics of the Two Samples 


The two samples were interviewed under 
the same conditions by the same group of 
interviewers in the company’s dining hall. 
A total of 130 interviews was completed (67 
easy sample and 63 difficult sample). 

Attrition of the original population of 296 
employees was due to several factors. A 
total of 96 were “laid off”; 6 quit and the 
remainder were on night shift or vacation or 
were ill or were illiterate. No significant 
differences between the samples could be at- 
tributed to these factors. 

The two samples did not differ significantly 
on the following social characteristics: 

Social and individual. Age, sex, years of 
schooling, marital status, mean scores on the 
authoritarian-equalitarian scale, and the 
Goossen disguised intelligence test. 

Job and union. Years with the company, 
years in current job, union membership, years 
in the union, readership of a union paper, and 
opinions about company, union, and job. 

Easy sample had more employees high and 
low in intellectual ability as measured by the 
Goossen disguised intelligence test. Mean 
test scores, however, did not differ significantly. 

Compared with the general population of 
American adults, these two samples of 128 
employees included more females (60 per 
cent); younger persons (36 per cent from 
20 to 30); more schooling (55 per cent with 
some high school and 13 per cent with some 
college). 

More than 60 per cent had worked more 
than five years for this firm and 65 per cent 
were union members. Of those who were 
union members two-thirds had been union 
members more than five years. 


Results 


Retention. The two samples did not differ 
significantly in mean scores on the 43-item 
test based on information in both versions of 
the 12 articles. 

Item analysis showed the two samples did 
not differ significantly on 37 of the 43 
information questions. Of six items where 
differences were significant, easy sample had 
higher scores on two and difficult sample had 
higher scores on four questions. 

No consistent patterns appeared in kinds 
of items on which one sample succeeded more 
than the other. Easy sample had higher 
scores on questions about annual sick leave 
and a provision of the working agreement. 
Difficult sample had more success on 
items about company history, the cause of 
hard water, and the name of an official who 
bargained with the union. 

The two samples did not differ in remembering 
information from easier or harder 
articles. The formulas did not appear to 
measure factors in the articles related to 
differences in retention. 

Readership. The two samples did not 
differ significantly in readership. Of easy 
sample, 65 per cent (n = 67) read both 
articles; of difficult sample 61 per cent 
(n = 63) read both articles. 

The easier versions did not reduce the 
proportion who failed to read both articles. Of 
easy sample 22 per cent did not read either 
article and 29 per cent of difficult sample 
did not read either article. 

Neither the readability formulas nor the 
Flesch human interest index seemed to measure 
factors in the articles related to differences 
in readership. 

Comprehension. Easier and harder versions 
of the two articles used in the test of 
readership also were used to test comprehension. 
Subjects read easy or difficult versions 
of the two articles in the test situation; 
immediately after reading they answered 10 
questions based on the two articles. These 
10 questions had been included in the initial 
43-item test. Mean scores on this 10-item 
test, before and after reading in the test 
situation, are shown in Table 1. 

Validity of Readability Formulas 

Table 1 
Mean and Variance Significance Tests for Sample Easy and Sample Difficult on Information Tests 
Note: Sample Easy N = 67; Sample Difficult N = 63. 

                                Mean     Mean       S.D.     S.D. 
                                Sample   Sample     Sample   Sample 
Variables                       Easy     Difficult  Easy     Difficult    F      t 
43-Item Information Test        20.93    22.29      5.96     5.54         1.16   1.34 
10-Item Test Before Reading      5.25     4.87      1.82     1.78         1.04   1.37 
10-Item Test After Reading       8.03     7.16      1.91     1.85         1.06   2.61** 
Gains in Correct Response on 
  10-Item Test After Reading     2.78     2.30      1.72     [illegible]  1.06   1.54 

** Significant at the 1% level. 

Easy sample did significantly better than 
difficult sample on the 10-item after-reading 
test. However, the two samples did not 
differ significantly on the before-reading test. 
This result indicates that the readability 
formulas did measure factors in the articles 
which related to differences in comprehension. 

An analysis of the 10 items showed that 
easy sample made consistent gains in comprehension 
over difficult sample. None of 
the gains appeared important except for one 
item. On this question easy sample showed 
four times as high a gain (54 per cent) in 
correct responses as difficult sample (13 per 
cent). 

The evidence, both qualitative and quantitative, 
showed that readability indices could 
be used to predict differences in comprehension 
between two versions of the same material. 

Readers vs. non-readers. In the readership 
test of two articles, about two-thirds of 
the 128 employees read both articles. The 
remainder either ignored both items or read 
one. Would readers have higher information 
scores than non-readers? Obviously, much 
information could have been learned from 
personal experience or other sources. Yet 
one might expect readers to know more; the 
reading behavior might be symptomatic of 
efforts to learn similar information from 
other sources. 

By the reading criterion, subjects were 
divided into three groups: 80 who had read 
both test articles in the company newspaper; 
15 who had read one; 33 who had read 
neither. The two extreme groups, readers 
and non-readers, were compared. 

On the 43-item information test the readers 
had a mean of 23.5 items, or 55 per cent, 
correct. Non-readers had a mean of 18.5, 
or 43 per cent, correct. This difference was 
highly significant (t = 5.11). 

Item analysis (by reader and non-reader) 
showed 11 significant differences. In each 
case readers were more successful. 

Of the remaining 32 items readers had a 
higher proportion of correct response on 29 
items. By the sign test, this was a highly 
significant difference. 
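
The sign test on the 32 remaining items can be checked with an exact binomial calculation. The sketch below rests on assumptions the article does not state: a two-sided test, no tied items among the 32, and items treated as independent with probability 0.5 of favoring either group under the null hypothesis. The function name is illustrative.

```python
from math import comb


def sign_test_two_sided(successes: int, trials: int) -> float:
    """Exact two-sided sign test p-value under H0: P(success) = 0.5."""
    k = max(successes, trials - successes)
    # Probability of a split at least as lopsided as the one observed.
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


# 29 of the remaining 32 items favored the readers.
p_value = sign_test_two_sided(29, 32)
print(f"two-sided p = {p_value:.2e}")
```

Under these assumptions the probability is far below the 1 per cent level, which agrees with the description of the difference as highly significant.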

Readers had significantly higher mean 
scores than non-readers on the 10-item test 
before but not after reading the two test 
articles. The non-readers gained more in 
comprehension. From before to after read- 
ing, the non-readers gained on the 10 items 
an average of 2.9 items correct, compared 
with 2.2 for readers. This was a significant 
difference (t = 2.00). 

Whether readers had more intellectual 
ability than non-readers became an important 
question. They did not differ in years of 
schooling or in Goossen disguised intelligence 
test scores. This suggested that readers might 
differ from non-readers on other social char- 
acteristics which could explain differences in 
motivation, or interest in the material. 

Ten factors were analyzed for clues to 
differences in motivation between readers and 
non-readers. These were age, sex, years 
with the company, years on the specific job, 
union membership, years in the union, read- 
ership of a union paper, general opinions 
about company and job, authoritarian-equa- 
litarian score, and union activity. Readers 
and non-readers did not differ on these factors. 
No characteristic discriminated between 
those employees more and less motivated 
to read and learn information from 
the company newspaper. 




Charles E. Swanson and Harland G. Fox 


Table 2 
Mean and Variance Significance Tests for Readers and Non-Readers on Information Tests 
Note: Readers N = 80; Non-Readers N = 33. 

                                         Mean              S.D. 
                                Mean     Non-     S.D.     Non- 
Variables                       Readers  Readers  Readers  Readers  F            t 
43-Item Information Test        23.46    18.50    4.83     5.98     1.54*        5.11** 
10-Item Test Before Reading      5.56     4.36    1.66     1.74     1.12         [illegible] 
10-Item Test After Reading       7.80     7.30    1.75     2.21     [illegible]  1.26 
Gains in Correct Response on 
  10-Item Test After Reading     2.24     2.94    1.61     1.86     [illegible]  2.00* 

* Significant at the 5% level. 
** Significant at the 1% level. 
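
The F and t columns can be approximated from the means, standard deviations, and group sizes alone. The sketch below assumes a pooled-variance t test and an F ratio of the larger to the smaller variance; the article does not say which formulas were used or whether the standard deviations were computed with N or N - 1, so the results need not match every printed entry. Function names are illustrative.

```python
from math import sqrt


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Two-sample t statistic using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


def variance_ratio(sd1: float, sd2: float) -> float:
    """F ratio with the larger variance in the numerator."""
    larger, smaller = max(sd1, sd2), min(sd1, sd2)
    return (larger / smaller) ** 2


# Gains on the 10-item test: non-readers (N = 33) against readers (N = 80).
t_gain = pooled_t(2.94, 1.86, 33, 2.24, 1.61, 80)
f_gain = variance_ratio(1.86, 1.61)

print(f"t = {t_gain:.2f}, F = {f_gain:.2f}")
```

For the gains row this gives a t of about 2.0, close to the value reported in the text. Other rows may differ more, which would suggest a different formula or rounding in the original computations.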


Summary and Discussion 


When easier and harder versions of 12 
articles were printed in three monthly issues 
of a company newspaper and two samples of 
128 employees were tested, it was found: 

1. Subjects exposed to harder versions succeeded 
as well on a 43-item information test 
as those exposed to easier versions. 

2. Harder versions succeeded as well as 
easier versions in attracting readers to two 
articles. 

3. Subjects who read easier versions of two 
articles in a test situation did significantly 
better on a 10-item test of comprehension 
than those who read harder versions. 

This result indicates that readability 
formulas can predict some differences in comprehension 
between versions of the same 
material. 

4. Readers of two articles were more successful 
on the 43-item test of information in 
the 12 articles than those who had not read 
either of the two articles tested for comprehensibility. 

These results indicate that readability 
formulas can be used to predict differences 
in comprehension between two [...] 
the same material [...] time periods. 
[...] readability factors, su[...] 
this study, did not [...] between 
easier and harder versions suggest 
that investigation of motivational factors inherent 
in content is most crucial where individuals 
select what they want to read and 
learn. This does not gainsay the possibly 
greater importance of readability where individuals 
are required to read and study 
in classroom and training situations. 

Received April 4, 1952. 


References 


1. Dale, E., and Chall, J. S. A formula for predicting 
readability. Educ. Res. Bull., Ohio State 
University, 1948, 27, 11-20 and 37-54. 
2. Flesch, R. How to test readability. New York: 
Harper & Brothers, 1951. 
3. Gilian, C. F. The construction and validation 
of a disguised intelligence test to [...] 
in public opinion interviewing. Unpublished 
Ph.D. thesis, University of Minnesota, [...]. 
4. Gray, W. S., and Leary, B. E. What makes a 
book readable. Chicago: University of Chicago 
Press, 1935. 
5. Hotchkiss, S. N., and Paterson, D. G. Flesch 
readability reading list. Personnel Psychol., 
1950, 3, 327-344. 
6. Klare, G. R. Evaluation of quantitative indices 
of comprehensibility in written communication. 
Unpublished Ph.D. thesis, University of Minnesota, 
1950. 
7. Ludwig, M. Hard words and human interest: 
their effects on readership. Journ. Quart., 
1949, 26, 167-171. 
8. Sanford, F. H. Authoritarianism and leadership. 
Philadelphia: Stephenson-Brothers, 1950. 
9. Swanson, C. E. Readability and readership: a 
controlled experiment. Journ. Quart., 1948, 
25, 339-343. 
10. Wilson, M. C. The effect of amplifying material 
upon comprehension. J. educ. Psychol., 
1947, 38, 149-156. 
11. Yoder, D., Heneman, H. G., Jr., and Cheit, E. F. 
Triple audit of industrial relations. Minneapolis: 
Industrial Relations Center, University of 
Minnesota, 1951. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 2, 1953 


A Note on Pre-testing Public Opinion Questions 


Robert C. Nuckols 


Life Insurance Agency Management Association, Hartford, Conn. 


There are probably no individuals engaged 
in measuring attitudes or public opinion who 
would not agree that it is wise to pre-test 
questionnaires. Many would probably say 
that the conventional pre-tests are conducted 
efficiently and result in well designed and 
adequately worded questionnaires. About 
this latter point there is some doubt. 

Several years ago this writer conducted 
a pilot study of respondent comprehension 
using a battery of “typical” opinion questions. 
The results of this study seem to 
shed some light on the question of the adequacy 
of our present pre-testing methods. 


Procedure 


Nine questions were chosen from “The 
Quarter’s Polls,” and were presented to a 
randomly selected group of 48 middle-income 
respondents in Cincinnati and Centerville, 
Ohio. The questions were selected to cover 
a wide range of reading difficulty as judged 
by the Flesch readability formula.* Of the 
nine questions, one had a difficulty equal to 
the adult average level as defined by Flesch, 
four were above and four below this level of 
difficulty. A second criterion for the selection 
of questions was that they be of topical 
interest to the respondents at the time this 
study was being conducted. 

To test the respondents’ comprehension of 
the questions, a rather simple procedure was 
used. The question was presented to the 
respondent and after his answer had been 
given he was asked to repeat in his own 
words the meaning of the question as nearly 
as he could. The interviewer then recorded 
the respondent’s interpretation verbatim. 
The order of question presentation was varied 
from respondent to respondent. 

There are a number of criticisms that one 
could level against this method of measuring 
comprehension. It may be argued that 
merely because a person can parrot a question, 
it does not necessarily follow that he 
comprehends its meaning. On the other 
hand, if a respondent gives a faulty interpretation, 
it seems fairly safe to conclude 
that he did misinterpret it. This would 
probably lead to an underestimation of comprehension, 
certainly not an overestimation. 

* Flesch, R. The art of plain talk. New York: 
Harper and Brothers, 1946. 

The respondents’ interpretation of each of 
the questions was judged to fall into one 
of four categories: (a) correct interpreta- 
tions, leaving out no vital parts; (b) gen- 
erally correct replies, or replies in which no 
more than one of the parts was altered or 
omitted; (c) partially wrong interpretations, 
but showing the respondent knew the gen- 
eral subject of the question; (d) completely 
wrong interpretations or no-response. As 
an example of the scoring take the question: 
“Suppose the government had no control 
over how the businesses are run in this coun- 
try, who do you think this would help the 
most—the people as a whole, or those who 
run big businesses, or those who run small 
businesses?” 

A partially correct interpretation was: “If 
there weren’t any control, which would have 
the greater power—the small business or the 
larger.” 

A partially wrong interpretation was: 
“Bout government owning business—who 
would benefit most, big businesses or small 
businesses.” 

Or: “Just who would get the business— 
the big guy or the little guy?” 

An example of a completely wrong inter- 
pretation was: “Something about having a 
President. If he does things that people 
don’t agree with, they have a right to tell 
him—like Walter Winchell.” 

The responses to the questions were judged 
individually by each of two judges. In case 
of disagreement, the response was discussed 
until agreement could be reached as to which 
interpretation category it belonged. 




Results 


There were 430 question interpretations of 
which 73, or 17.0 per cent, were either wholly 
or partially wrong. Two respondents did not 
make an interpretation of one question be- 
cause, in one case, the telephone rang and, 
in the other case, something was boiling 
over on the stove. 

The findings would not be startling if one 
could say that these questions have now 
been pre-tested and can be re-worded so as 
to make them more comprehensible. How- 
ever, the questions used in this study had 
already been presented to large cross sections 
of the general public by well known 
polling organizations. That is, these are 
questions after they presumably have been 
subject to the usual pre-test. 

If the questions had been asked of the 
respondents and only their answers recorded 
in the usual way these errors of comprehen- 
sion would not have been detected. In no 
instance did a respondent say that he did 
not hear a question or that he misunderstood 
it. The questions were asked, answers given, 
and all seemed well. 

If one grants that some degree of respondent 
comprehension may be missed in the 
usual pre-test, it still may be asked if this 
error contributes to any inaccuracy in poll 
results. From this study, the answer seems 
fairly clear. Four questions contributed by 
far the most to the total amount of miscomprehension. 
In two of these questions, 
there was a marked and statistically significant 
tendency for those not comprehending 
to reply “don’t know.” On one question there 
was a significant tendency for the non-comprehending 
respondent to answer “approve” 
to a question dealing with the United Nations. 
This institution was enjoying a high 
degree of popularity at the time this study 
was conducted, and hence lik[...]ing 
the words “United Nations” [...] 
In this instance, the [...] 
about placing atomic energy under UN control. 
There was no tendency evident for 
those miscomprehending the remaining question 
to reply differently from the rest of the 
sample. This in itself might be damning to 
that question. 

Because the sample of questions is small, 
and because several of the questions yielded 
equal miscomprehension scores, the correlation 
between comprehension and readability 
has not been presented here. This study 
was not designed to be a validation of the 
Flesch index; however, since there may be 
some interest in the relationship found, the 
Flesch score of each question and the number 
of persons miscomprehending the question 
are presented in Table 1. The number 
miscomprehending the questions includes 
those making wrong and partially wrong 
interpretations. 


Table 1 

Relation Between Readability and Comprehensibility 
of Nine Opinion Poll Questions 

Estimated Reading Grade      No. of 
Placement                    Miscompre- 
(Flesch Score)               hensions 
 5.8                          1 
 6.1                          1 
 7.2                         14 
 7.6                         14 
 8.5                         14 
11.0                          1 
12.8                         14 
14.0                          3 
17.2                         11 
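
The counts in Table 1 can be reconciled with the totals given under Results by simple arithmetic. The sketch below assumes that each of the 48 respondents was asked all nine questions and that the two missing interpretations are the only omissions; variable names are illustrative.

```python
# Number of miscomprehensions keyed by estimated reading grade (Table 1).
miscomprehensions_by_grade = {
    5.8: 1, 6.1: 1, 7.2: 14, 7.6: 14, 8.5: 14,
    11.0: 1, 12.8: 14, 14.0: 3, 17.2: 11,
}

respondents = 48
questions = 9
missing_interpretations = 2  # telephone call and pot boiling over

interpretations = respondents * questions - missing_interpretations
total_wrong = sum(miscomprehensions_by_grade.values())
percent_wrong = 100 * total_wrong / interpretations

print(interpretations, total_wrong, round(percent_wrong, 1))
```

The table sums to the 73 wholly or partially wrong interpretations reported, and 73 of 430 is the stated 17.0 per cent.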


Summary 


From the results noted here, it would 
seem that conventional pre-testing fails to 
uncover many questions that are later misinterpreted 
by respondents in the main survey. 
And it would seem that the failure to 
word some questions so as to bring respondent 
comprehension to a maximum may result 
in distortion of the survey results. 
Hence, a few extra minutes spent gaining 
some rough measure of comprehensibility of 
the questions may well pay ample dividends 
in increased survey accuracy. 


Received May 14, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 2, 1953 


A Study of Respondent Forewarning in Public Opinion Polls * 


Robert C. Nuckols 


Life Insurance Agency Management Association, Hartford, Conn. 


Most of us have had the experience of 
being called upon unexpectedly to give an 
opinion about some question, state a course 
of action, or criticize some proposal in an 
intelligent manner. It is possible that in 
such situations we made replies that we later 
recognized as missing the point, as not fully 
expressing our position, or that would have 
been more valuable if we could have thought 
of this, that, or the other alternative. It is 
conceivable that a large proportion of respondents 
to the typical opinion poll find 
themselves in a similar position. The respondent 
may give a forced answer to the 
persistent probing of the interviewer. However, 
after the interviewer has gone these respondents 
may recall many pertinent bits of 
information or opinion that would clarify, 
amplify, or even change their original position. 
These additional remarks on the part 
of the respondent should be of some interest 
in the analysis of opinion. 

This study was undertaken to determine 
the effects of forewarning the respondents 
of a “typical” opinion poll of the purpose and 
nature of the approaching interview. 

It is hypothesized that forewarning by 
means of an introductory letter will give the 
respondent an opportunity to think about 
and discuss the various topics listed in the 
letter and so be prepared to give more de- 
tailed and thought out answers than he would 
with no such opportunity. It is also hy- 
pothesized that by forewarning, the respond- 
ent will be more prepared to cooperate with 
the interviewer and therefore make the in- 
terview more enjoyable, both from the in- 
terviewer’s and the respondent’s point of 
view. 


* This study is part of a dissertation submitted in 
partial fulfillment of the requirements for the degree 
of Doctor of Philosophy at the Pennsylvania State 
College. The study was completed while the investigator 
was a fellow in psychology of the Britt Foundation. 


Method 


Two community surveys, one in Altoona 
and the second in Williamsport, Pennsyl- 
vania, were conducted to test these hypothe- 
ses. The samples were drawn from the most 
recent city directory on an every nth dwelling 
unit basis. One of these sub-samples, com- 
prising 60 per cent of the total sample in 
each city, was designated for the sending 
of the forewarning letter. Letters were sent 
to more than one-half of each sample to 
allow for the normal number of substitutions 
and refusals. The letters were sent so as to 
be received at least three days before the 
interview. 
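
The sampling plan can be sketched in a few lines. The article does not say how the 60 per cent sub-sample was chosen, so the random split below is an assumption, as are the directory entries, the sampling interval, the starting position, and the function names.

```python
import random


def systematic_sample(directory: list, step: int, start: int = 0) -> list:
    """Take every step-th dwelling unit, beginning at position start."""
    return directory[start::step]


def split_forewarned(sample: list, share: float = 0.60, seed: int = 0):
    """Randomly designate a share of the sample to receive the letter."""
    rng = random.Random(seed)
    shuffled = sample[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * share)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical city directory of 1,000 dwelling units.
directory = [f"dwelling-{i:04d}" for i in range(1, 1001)]

sample = systematic_sample(directory, step=10, start=3)
forewarned, not_forewarned = split_forewarned(sample)

print(len(sample), len(forewarned), len(not_forewarned))
```

Sending letters to more than half of the sample leaves room for substitutions and refusals among the forewarned addresses, as the text notes.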
The forewarning letter read as follows: 


Dear Residents of Altoona: 


Many things, both big and small, are impor- 
tant to you in deciding whether or not a city is a 
good place in which to live. In an effort to make 
Altoona a better place in which to live, a study 
is being made by Pennsylvania Surveys at the re- 
quest of the Altoona Chamber of Commerce. 

To meet this aim it is important for you, as 
residents of Altoona, to speak your thoughts and 
opinions on several topics of community interest. 
Only you and your neighbors can paint a true pic- 
ture of your city. We feel sure that you will co- 
operate to help make this project a success. 

On Tuesday, November 28, a representative of 
Pennsylvania Surveys will call on you at your 
home. He, or she, will ask: 


About the transportation service, within Al- 
toona, and into and out of Altoona. 

About business and industry in Altoona. 

About services provided by the city govern- 
ment of Altoona. 

About the amount and kind of public recrea- 
tion available in Altoona. 

About the housing situation in Altoona. 

About the public schools. 

For over-all suggestions that would make Al- 
toona a better place in which to live. 


We realize that you are concerned with many 
problems other than those of a purely local in- 
terest. Therefore, we have sent you this letter 
so that when the representative calls you will 
have had some time to think about these prob- 
lems. We hope you will think about these top- 




non-forewarned. As is noted in the table, 
this hypothesis was not confirmed in either 
city. 

In both cities there was only a negligible 
tendency for the forewarned respondents to 
give more answers to the open-end questions 
than the non-forewarned. In neither city was 
the difference close to being significant. 

The forewarned group in the preceding 
analyses contained respondents who did not 
report receiving the letter, and those who 
reported not understanding the topics. To 
further test the effects of forewarning, those 
respondents in Williamsport who claimed to 
have received the letter and to have under- 
stood its meaning were selected out of the 
over-all forewarned group. A sample of re- 
spondents was drawn from the non-fore- 
warned group to match as completely as 
possible the informed-forewarned in respect 
to age, sex, socio-economic status, and educa- 
tional attainment. These two groups were 
then analyzed on the same variables as dis- 
cussed above. Only those questions cover- 
ing topics that were mentioned specifically in 
the forewarning letter were included for 
analysis. 

If there was any marked tendency for 
these informed-forewarned respondents to 

change their responses on the basis of the 
opportunity to discuss and think about the 
topics then it should turn up in this matched 
sample analysis. However, in no instance 
were the hypotheses verified. That is, there 
was no tendency for the informed-forewarned 
group to give more responses to open-end 
questions, fewer “don’t know” responses, less 
stereotyped or fewer non-reality replies, or 
to accept more extreme statements of opinion. 

Returning once more to the full samples, 
the hypotheses concerning respondent cooperation 
were analyzed next. Because some 
of the interviewers failed to record refusals, 
it was impossible to determine the effects of 
forewarning upon the refusal rate. However, 
with data obtained from the original 
sample listings it was possible to analyze the 
rate of substitution at forewarned and non-forewarned 
addresses. A difference of six 
per cent was obtained in the direction of 
fewer substitutions at addresses to which letters 
had been sent. This difference was significant 
at the 10 per cent level of confidence. 

Interviewer ratings of the respondent’s 
cooperativeness, eagerness to discuss the questions, 
and apparent information showed differences 
in favor of the hypothesis. In both 
Altoona and Williamsport the forewarned 
respondents were rated significantly more 
cooperative than the non-forewarned. In 
neither city was a significant difference found 
in respect to the respondent’s eagerness to 
discuss the questions, but the differences that 
did exist were in the predicted direction. The 
interviewers did not rate the Altoona forewarned 
respondents as being more informed; 
however, there was a significant difference in 
favor of the forewarned group in Williamsport. 


Discussion 


When considering the questions individually 
it was found that the number of significant 
differences in this study could have 
been found on the basis of chance alone. 
Secondly, the differences that were found were 
not consistently in the predicted direction, 
nor were they consistent between the two 
cities. The matched-sample study, which 
tested the hypotheses under the most rigorous 
conditions, did not disclose any differences 
that would uphold the hypothesis that 
the forewarning letters lead to more meaningful 
or a greater number of responses. 

The general lack of effectiveness of the 
forewarning letter may have resulted from 
several uncontrolled variables. It is possible 
that the forewarned respondents, [...] 
more completely or given more time for 
their answers, would have given meaningful 
responses and responses with more [...]. 
It is also possible that the older and stronger 
opinions, those most likely to be stereotyped, 
would be given first. If the interviewers did 
not record all the answers or were in a hurry 
to get to the next question the effects of 
forewarning might be nullified. [...] 
while this may all be true, it is [...] 
the interviewers were well motivated, did a 
competent job, and were comparable in ability 
to the typical interviewer used in many 
market research studies. 




Another factor that might account for the 
negative findings is the letter. If the letter 
had not been mimeographed, but rather had 
been made more attractive or had spelled out 
the topics in a simpler or more understand- 
able fashion, the respondents might have 
been more motivated to read the letter and 
take the suggested action. On the other 
hand, the letter had been pretested on a 
small sample in Altoona and checked for 
readability. Short sentences, short words, 
and large type were used, hence the letter 
should have been understandable to any 
person who could read a newspaper. While 
a more attractive letter might have secured 
more readers, the results of the matched- 
sample analysis showed readers to respond 
no differently from the non-forewarned. 

The negative findings might have resulted 
because of the nature of the survey content. 
The respondents might have been more inclined 
to think about and discuss the national 
administration or the most recent baseball 
trades, rather than purely local issues. 
Here we may have the most logical explana- 
tion of the findings. This study may well 
have served to point out once again the pub- 
lic’s indifference to civic affairs. In many 
of our cities well-documented exposés of civic 
maladministration or the pressing need for 
certain improvements fall on an unrespon- 
sive public. It may not be too far-fetched to 
believe that the forewarning letter met a 
similar fate. 

The interviewer ratings are subject to the 
criticism that they were made after determining 
if the respondent had received a forewarning 
letter. Nevertheless, there is some 
subjective evidence that tends to uphold the 
general validity of the ratings. 

The interviewers either volunteered or were 
asked whether or not they had the feeling 
that the forewarning letter made any differ- 
ence in the respondents’ cooperativeness. 
Many of the interviewers reported that they 
felt the letter did help in securing rapport 
and no interviewer reported that the letter 
made the respondent more suspicious or un- 
cooperative. The interviewers claimed that 
they could predict the forewarned respondent 
with some accuracy before asking for knowl- 
edge of the survey. Moreover, the inter- 
viewers were not told the purpose of this 
study. They knew that some respondents 
would know of their coming; however, they 
were led to believe that this was primarily 
a check on their honesty in meeting the as- 
signment. Therefore the hypothesis of in- 
creased cooperation would have to be arrived 
at individually during the course of the in- 
terviewing period. 

If neither of these lines of argument vali- 
dates the assumption of increased respondent 
cooperation it might be further argued that 
the question of the validity of the ratings is 
unimportant. If interviewers believe that 
forewarned respondents are more coopera- 
tive it makes very little difference whether 
they truly are more cooperative or not. It 
would seem that forewarning by mail can 
be an effective factor in making interview- 
ing a more pleasant occupation, and that it 
can be done fairly inexpensively. 


Received June 16, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 2, 1953 


Influence of Ink Color on Handwriting of Normal and 
Psychiatric Groups * 


Walter A. Woods 


Richmond Professional Institute, Richmond, Virginia 


Color psychotherapy has aroused new in- 
terest in recent years and efforts have been 
made to re-establish the reputation of this 
declining field. Some writers have suggested 
that color vision is influenced by emotional 
states. Kravkov (3) has found that, under 
adrenergic influence, the retina is more sensitive 
to blue-green and less sensitive to red-orange. 

From such studies it has been inferred by 
some that colors, as sensations (apart from 
symbolic content), are influential in producing 
states of emotion. In a recent popular 
book, Color psychology and color therapy, 
Birren (1, p. 150) draws upon such research 
to conclude: “To state a principle, it seems 
that the immediate action of any color stimulation 
is followed in time by a reverse effect. 
Red increases blood pressure, which later becomes 
normally depressed. Green and blue 
decrease blood pressure and later cause it to 
rise. . . .” 

Birren relies on the work of Goldstein (2) for this generalization. He calls attention to
Goldstein's observation (1, p. 149): "One could say red is inciting to activity and
favorable for emotionally determined actions; green creates the condition for meditation
and exact fulfillment of the task. Red may be suited to produce the emotional background
out of which ideas and actions will emerge; in green these ideas will be developed and
the actions executed."

One might inquire concerning the basis on which Goldstein formulates these "principles."
A report of his research is found in an article appearing in Occupational Therapy and
Rehabilitation (2). Inasmuch as he merely refers to the research and describes neither
procedure nor data, it is impossible to determine how he arrived at his conclusions.
The general nature of his findings is that activity which takes place under red light
or in which red equipment is used will tend to be performed in a more emotional manner,
whereas activity engaged in under green light or with green equipment will be
"thoughtful" in nature. He describes an experiment (no data included) in which it was
demonstrated that a subject with arms extended in front would, when illuminated with
red light, tend to move his arms outward. If illuminated with green light, he would tend
to draw his arms together in front of the body. He also discusses the influence of
colored light or colored ink on handwriting. He states: "Words written in red ink or
green ink (if the patient pays attention to the color) show different size of letter and
different distances between the letters. Handwriting in green light or with green ink is
much more similar to the normal handwriting than that in red light or in red ink"
(2, p. 149).

* From research conducted ... Kansas State College. ... aramore, Sup... Clinical
Psychologist ... help in this project.
Contrary findings are reported by Vollmer (9). He is unable to verify that the arms of
subjects, held forward and parallel, deviate toward red and away from blue light.

Lukens and Sherman (4) found that the use of red, black or white materials by patients
in weaving produced no differential results in woven objects.

In view of the inconclusive and conflicting nature of the evidence on which much of
the contemporary opinion concerning color therapy is based, fundamental experiments are
necessary. The present article reports such an experiment.

The Experiment 


A total of 132 subjects were used. Of these, 66 were college students and 66 were
patients in the State Mental Hospital, Larned, Kansas, all classified as psychotic,
psychoneurotic, or psychopathic personality and all in a state of remission suitable for
occupational therapy and engaged in occupational therapy programs.

Each subject was asked to write the fol- 
lowing statement in each of three colored 
inks, red, green, and black, and with a pen- 
holder which corresponded to the ink color. 


Dear Joe, We received your letter and 
expect to see you next week, (signature) 

Subjects were asked to write this statement on a sheet of white paper, 5½ x 8½
inches. Their attention was repeatedly directed to the fact that different ink colors
were being used.

The particular statement was selected after 
preliminary experimentation, since it met the 
following requirements: (1) it was of such 
length that the average writer would not be 
tempted to cram it onto one line but could 
easily write it on two lines (length of the 
material should not influence choice of size
or form of handwriting); (2) it was not so
long that fatigue would be introduced; and
(3) it was symbolically as meaningless as
possible yet retained literary form.

The two major groups were subdivided into 
six sub-groups of 11 normals and 11 abnor- 
mals, in order to equalize the effect of the 
order in which the different colors were used. 
Group I used inks in the order R Bl G;
Group II, R G Bl; Group III, G R Bl;
Group IV, G Bl R; Group V, Bl R G; and
Group VI, Bl G R.

This design supplied a total of 396 hand- 
writing samples, 132 in red ink, 132 in black 
ink and 132 in green ink, one third of each 
ink color having been written first, one third 
second and one third last in the series. 
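The counterbalancing arithmetic above can be checked with a short sketch. It assumes only what the text states: three inks, every possible order used once, and 22 writers (11 normal, 11 psychiatric) per order. The variable names are illustrative and not part of the original study.

```python
from itertools import permutations

INKS = ("R", "Bl", "G")      # red, black, green
SUBJECTS_PER_ORDER = 22      # 11 normal + 11 psychiatric writers per sub-group

# Every possible order of the three inks; the text lists these as Groups I-VI.
orders = list(permutations(INKS))

# position_counts[ink][p] = number of samples of that ink written in position p
position_counts = {ink: [0, 0, 0] for ink in INKS}
for order in orders:
    for position, ink in enumerate(order):
        position_counts[ink][position] += SUBJECTS_PER_ORDER

total_samples = sum(sum(counts) for counts in position_counts.values())
print(len(orders), position_counts, total_samples)
```

Each ink lands in each serial position equally often, which is what lets the order effect be separated from the color effect.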

Handwriting samples were measured on
a millimeter scale, and means were determined
for each sub-sample. Variance estimates
of the sub-groups and major groups
were made.


Results 
Sub-sample, order, and total sample means are given in Table 1. Color (column) means
do not differ appreciably, nor do the order of writing (row) means. However, means for
normal and for psychiatric groups differ in every instance.


Table 1

Means of 18 Groups of 22 Samples Each (N equals 396) of Handwriting Classified
According to: (1) Ink Color in Which Written; (2) Order in Which Ink is Used; and
(3) Psychiatric Classification of Writer

(Measurements in millimeters)

Order of    NP                    Color of Ink           Order
writing     Classification    Red     Green   Black      Means
1           Normal            20.9    22.4    ...7       21.7
            NP                26.9    26.5    ...1       26.9
2           Normal            20.6    21.4    21.5       21.4
            NP                27.4    27.1    24.7       26.4
3           Normal            22.2    21.7    ...2.1     22.0
            NP                26.4    26.7    ...7.5     26.8
Color       Normal            21.6    21.8    21.7       2057
means       NP                26.8    26.4    26.7

Analysis of variance

Source                df      Variance Estimate
Ink Color              2      ...
Order of Writing       2      11.26
NP Classification      1      3517.76
Color/order            4      14.22
Color/NP Class         2      3.53
Order/NP Class         2      .52
Order/Color/NP         4      18.20
Individual Diff.     359      9.66

$\dfrac{\text{Order}}{\text{Ind. Diff.}} = 1.1$; $\dfrac{\text{NP Class}}{\text{Ind. Diff.}} = 364.1$; $\dfrac{\text{Color}}{\text{Ind. Diff.}} = \ldots$

The F test reveals that variance ratios in every instance are such that they would be
expected by chance, except in the instance of differences between normal and abnormal
groups. These differences are significant at the .05 level of probability. Variations
not due to difference in psychiatric classification are due to individual differences.
F's are so small as to leave no doubt that the hypotheses must be accepted that
differences due to color of ink used, order in which the sample is written, interaction
between color and order, interaction between order and psychiatric classification,
interaction between ink color and psychiatric classification, and interaction between
ink color, order of writing and psychiatric classification are those which might be
expected by chance from a random sample of handwritings.
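As a rough check on the variance ratios discussed above, the sketch below divides each variance estimate from the analysis of variance by the individual-differences estimate (9.66). The Ink Color row is left out because its variance estimate cannot be read from the table. The dictionary and variable names are illustrative only.

```python
# Variance estimates as listed in the analysis of variance (Ink Color omitted).
variance_estimates = {
    "Order of Writing": 11.26,
    "NP Classification": 3517.76,
    "Color/order": 14.22,
    "Color/NP Class": 3.53,
    "Order/NP Class": 0.52,
    "Order/Color/NP": 18.20,
}
ERROR_VARIANCE = 9.66  # "Individual Diff." estimate, used as the error term

# F ratio for each source = its variance estimate / error variance.
f_ratios = {source: v / ERROR_VARIANCE for source, v in variance_estimates.items()}

for source, f in f_ratios.items():
    print(f"{source:<20} F = {f:.2f}")
```

Only the ratio for psychiatric classification is far above 1; every other ratio computed here is below 2, in line with the conclusion drawn in the text.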

It would be interesting to discover what 
it is that contributes to the significant differ- 
ences which exist between normal and psy- 
chiatric handwriting samples. However, the 
design of the present experiment does not 
permit inquiry into this matter. 


Summary 


Color of ink employed in handwriting has 
no influence on the size of the handwriting. 

Popular concepts concerning the influence of colored equipment or colored lights on
motor performance (and possibly on emotional affect) must be revised until or unless
more substantial evidence is uncovered to support these ideas. Nothing in the present
experiment supports occupational therapy based on the influence of single colors.

Received December 13, 1952.
Early publication.


References

1. Birren, F. Color psychology and color therapy. New York: McGraw-Hill, 1950.
2. Goldstein, K. Some experimental observations concerning the influence of colors on
   the function of the organism. Occup. Ther. Rehabilit., 1942, 21, 147-151.
3. Kravkov, S. V. Color vision and the autonomic nervous system. J. opt. Soc. Amer.,
   1941, 31, 335-337.
4. Lukens, N. M., and Sherman, I. C. The effect of color on the output of work of
   psychotic patients in occupational therapy. Occup. Ther. Rehabilit., 1940, 20, 121.
5. Orr, M. E. Color therapy. Occup. Ther. Rehabilit., 1942, 21, 33-40.
6. Podolsky, E. The doctor prescribes colors. New York: National Library Press, 1938.
7. Prescott, B. D. The psychological analysis of light and color. Occup. Ther.
   Rehabilit., 1942, 21, 135-146.
8. Reeder, J. E. The psychogenic color field. Amer. J. Ophthal., 1944, 27, 358-361.
9. Vollmer, H. Studies in the biologic effect of colored light. Arch. Phys. Ther.,
   1938, 197-211.
10. Mass. Assoc. for Occup. Ther. Bull., 1938, 12, 6-7.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953 


A Punched Card Procedure for Use with Partial Pairing 


James E. Oliver 


Cadillac Motor Car Division, 
General Motors Corporation, 
Detroit, Mich. 


In using the method of paired comparisons, 
McCormick and his students (2, 3) have 
drawn attention to the feasibility of using 
partial pairing, as opposed to complete pair- 
ing, and have reported its use relative to the 
rating of employes. The partial pairing tech- 
nique should, under any circumstance, result 
in the abbreviation of the time required for 
the preparation, rating, and scoring of pairs 
in proportion to the extent that pairing is 
partial rather than complete. 

In a previous article (1) a procedure was 
discussed for the use of punched card equip- 
ment to facilitate rapid preparation and scor- 
ing of a complete pairing deck in accord with 
the traditional use of the paired comparison 
technique. This procedure can be tenably applied
with equal facility to prepare a partial
pairing deck. It should be particularly useful
in cases where the method is used with N's of
15-20 or greater.


Partial Pairing 


If N is an even number, the minimum num- 
ber of pairs needed in a partial pairing deck 
is that required to give each of N individuals 
opportunity to receive at least one choice. 
The minimum number of pairs needed when 
N is an odd number, however, is that re- 
quired to give each of N individuals oppor- 
tunity for two choices. The composition of 
such a minimum partial pairing deck has 
been described by what Kephart and Oliver 
(1) have arbitrarily termed “set.” Departure 
from complete pairing is conditioned by the 
number of “sets” incorporated in a partial 
pairing deck. The number of "patterns" (2)
that may be used with any particular N is
the number of possible combinations of "sets."
In this respect, of course, the inclusion of all
sets results in complete pairing.

If N is an even number, there are N/2 sets
in a complete pairing deck. As an example,
consider an N of 6, permitting numbers to
represent names being paired.


Set 1    Set 2    Set 3
1-2      1-3      1-4
2-3      2-4      2-5
3-4      3-5      3-6
4-5      4-6      4-1  \
5-6      5-1      5-2   | DESTROY
6-1      6-2      6-3  /

One-half of set 3 is destroyed since each
half contains the same three pairs and is
extraneous to a complete pairing deck. The
three remaining pairs in set 3 give each of
the six individuals opportunity for one choice.
Set 1 and set 2, each composed of 6 pairs, 
give each of the 6 individuals opportunity for 
2 choices. Therefore, we can pair everyone 
with one other individual by using only set 
3, everyone with 2 other individuals by using 
either set 1 or 2, everyone with 3 other in- 
dividuals by the combined use of set 3 with 
either 1 or 2, everyone with 4 other individ- 
uals by the combined use of set 1 and 2, or 
complete pairing by the use of all three sets. 
The small N of 6 is used for illustrative purposes
only, but the same principle is operative
for an even N of any size. For example,
if N were 50, we would have 25 sets.
Pairing can be made partial in multiples of 
1, and the extent to which it is partial is de- 
termined only by the number of sets incor- 
porated in the final deck to be used. 

If N is an odd number, there are (N - 1)/2 sets. As an example, consider an N
of 7, again permitting numbers to represent
individual names in the pairs.


Set 1    Set 2    Set 3
1-2      1-3      1-4
2-3      2-4      2-5
3-4      3-5      3-6
4-5      4-6      4-7
5-6      5-7      5-1
6-7      6-1      6-2
7-1      7-2      7-3


Although one-half of the last set is always
destroyed when N is even, this is not
characteristic when N is odd. Each of the
three sets above consists of 7 pairs, and 
gives each individual opportunity to receive 
2 choices. We can, therefore, incorporate 
either set 1, 2, or 3 into a partial pairing 
deck and have everyone paired with 2 other 
individuals, or use any two sets to pair 
everyone with 4 other individuals. The use 
of all 3 sets results in complete pairing. 
Therefore, when N is odd, pairing can be 
made partial in multiples of 2, and the extent 
to which it is partial is determined by the 
number of sets incorporated in the deck. 
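The construction of sets for even and odd N can be sketched in a few lines. This is an illustration of the rule shown in the two examples above (set k pairs each individual i with individual i + k, wrapping around at N, and for even N the duplicate half of the last set is destroyed). The function names `pairing_sets` and `partial_deck` are hypothetical and do not describe the punched card procedure itself.

```python
def pairing_sets(n):
    """Return the list of sets for n individuals numbered 1..n.

    Set k pairs individual i with individual i + k (wrapping past n).
    For even n, the last set (k = n/2) contains every pair twice, so
    only its first half is kept.
    """
    sets = []
    for k in range(1, n // 2 + 1):
        pairs = [(i, (i + k - 1) % n + 1) for i in range(1, n + 1)]
        if n % 2 == 0 and k == n // 2:
            pairs = pairs[: n // 2]  # destroy the duplicate half
        sets.append(pairs)
    return sets


def partial_deck(n, chosen_sets):
    """Combine the chosen sets (1-based set numbers) into one deck of pairs."""
    all_sets = pairing_sets(n)
    return [pair for k in chosen_sets for pair in all_sets[k - 1]]


if __name__ == "__main__":
    for number, pairs in enumerate(pairing_sets(6), start=1):
        print("Set", number, pairs)
    # Set 3 alone pairs everyone with one other individual.
    print(partial_deck(6, [3]))
```

Including every set reproduces complete pairing: 15 pairs for N = 6 and 21 pairs for N = 7.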


Summary 


The method of paired comparisons has long been considered somewhat laborious to
say the least. In a previous article (1) a punched card procedure was outlined to
facilitate rapid preparation and scoring of the pairs as the method has been
traditionally used. The discussion of the punched card procedure has here been
extended to draw attention to its applicability to partial pairing, a technique to
further abbreviate time and labor requirements in preparing, rating, and scoring the
pairs. The procedure is systematic and may be used with any number of variables.


Received June 7, 1952. 


References

1. Kephart, N. C., and Oliver, J. E. A punched card procedure for use with the method
   of paired comparison. J. appl. Psychol., 1952, 36, 47-48.
2. McCormick, E. J., and Bachus, J. A. Paired comparison ratings. I. The effect on
   ratings of reduction in the number of pairs. J. appl. Psychol., 1952, 36, 123-127.
3. McCormick, E. J., and Roberts, W. K. Paired comparison ratings. II. The reliability
   of ratings based on partial pairings. J. appl. Psychol., 1952, 36, 188-192.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953


Pointer Location and Accuracy of Dial Reading 


Sherman Ross, 
William Ray 


and 
Louis Della Valle 


University of Maryland 


Accuracy of dial reading and the conditions of which it is a function constitute a
problem of interest to those psychologists who are concerned with display problems.
The facts of the relationship between accuracy and its determinants have important
applications to industrial and military situations. Reviews of the previous work
accomplished have been presented in several sources (1, 3, 5, 8).

Although a considerable amount of experi- 
mental effort has been expended in this area, 
little attention has yet been paid to the 
specific questions with which the present study
was concerned. In general, this experiment 
attempted to determine the relationship be- 
tween the accuracy of reading and the dial 
sector and specific location of the dial pointer. 

Kappauf and Smith (7) found that the 
sector had no consistent effect on either local 
errors or systematic errors for many dials, 
but sector location may influence the occur- 
rence of specific systematic errors on certain 
scales. Dials graduated from 0 to 50 and 
0 to 100 revealed an error more prevalent on
right dial halves than on left halves on scales 
numbered by tens. 

Christensen (2) studied exposure time as a 
factor in dial reading performance. Moving 
scale dials were better at short exposures 
while moving pointer dials were better at long 
exposures. Sleight (9) compared dial shapes 
for legibility. In the order of accuracy of 
readings the dials ranked as follows: (1) 
open-window; (2) round; (3) semicircular; 
(4) horizontal; and (5) vertical. 

In a study of instrument recording per- 
formance under varied illuminating condi- 
tions, Spencer (10) reported readings most 
accurate at the 12 o’clock sector of the dial, 

but his results were not consistent. In studies
of check reading of fixed-scale, moving
pointer instruments, Warrick and Grether 
(11) and Grether and Connell (4) reported 
more frequent correct responses when the 
index is at the 9 o’clock position than when 
it is at the 3 o’clock position. 

In a study of the effect of pointer design 
and pointer alignment position on speed and 
accuracy of instrument readings, White (12) 
had his subjects make a qualitative reading
of the deviation from vertical among 16
simulated engine instruments in order to
make a correction. Alignment at the 9
o’clock position was superior for qualitative 
reading. In another experiment his subjects 
had to check-read a panel of simulated in- 
struments with pointer alignment at the 9, 
12, 3, and 6 o’clock positions and indicate 
misalignment. No significant differences in 
response time and errors were found. Hor- 
ton (6) found an increase in the frequency 
of systematic errors with sector errors being 
more than twice as frequent on the left half 
of the scope as on the right. In an un- 
published study from this laboratory it was 
found that fewer errors were made at and 
around the 9, 12, and 3 o’clock positions and 
more errors were made at some intermediate 
points in a circular dial. Our results in this 
respect were not entirely consistent, how- 
ever, because in both groups there were mid- 
division settings which were not numbered. 

From the literature cited several findings 
are of particular interest in connection with 
the present study. Kappauf and Smith (7) 
found that sector had no consistent effect. 
When reversal errors were frequent, sector 
was then observed to be important. Spencer 
(10) reported more accurate readings in the 
12 o'clock sector, but his results were not 
consistent. White (12) found that the 9
o'clock position was superior for reading of
deviations from vertical. Finally, we ob- 
served a tendency for fewer errors at and 
around the 9, 12, and 3 o’clock positions. 

Three dial shapes were used in the present 
study in an attempt to answer certain ques- 
tions which can be raised concerning the in- 
fluence of sector and pointer location on 
accuracy of reading. These dials are: (A) 
semicircular upright dial; (B) semicircular 
inverted dial; and (C) circular dial (see Fig- 
ure 1). 

The following specific questions were asked 
concerning accuracy of reading the three 
dials: 


1) Are errors in a particular quadrant a 
function of the dial shape in which the quad- 
rant occurs? 


Fig. 1. The dial shapes used in the experiment.


2) Are intra-dial errors for Dial C a func- 
tion of the quadrant in which readings are 
made? 

3) Are intra-dial errors for Dial C related 
in a systematic way to pointer positions of 9, 
12, 3, and 6 o'clock compared to intermediate 
positions? 

4) Are errors a function of the dial half 
(upper and lower) in which readings are 
made? 


Method and Procedure 


Subjects: The subjects used in the experiment were eight male and two female
university students. They ranged in age from 20 to 30 years. Each subject had a minimum
Snellen index of 20/20 (corrected or uncorrected) in each eye.

Apparatus: The apparatus used to present the dial settings to the subjects was a
modification of the Dodge tachistoscope. The interior was painted black, and the subject
viewed a single dial through a binocular eyepiece. The pre-adapting illumination and
the presentation illumination were provided by two pairs of 25 watt bulbs. The distance
from the subject's eyes to the test dials was 42 in.

An electronic interval timer was used to present exposure periods which were set at
0.1, 0.3, 0.4, 0.5, and 0.7 sec.

The dials were constructed to follow the design characteristics suggested in
"Standards to be Employed in Research on Visual Displays," Armed Forces-NRC Vision
Committee, 1 March 1950. All characteristics of the three dials were held constant as
shown below:

1. All numbers were made by India ink on white cardboard using a No. 3 pen and a
LeRoy lettering guide.

2. The diameter of each dial was ...

3. The distance between graduations on the circumference of the scale was %6 in.;
the length of each graduation unit was 716 in.

4. The height of each numeral was 7 ... and the stroke width was approximately
Yo in.

5. The 0 setting for each dial was at 9 o'clock and the 10 setting was at 3 o'clock.



Table 1

The Error Scores for the Three Dials Tested

         Dial A            Dial B             Dial C
Subject  Whole  QI  QII    Whole  QIII  QIV   Upper Half  Lower Half  QI  QII  QIII  QIV  Cardinal Points  Intermediate Points
1 35 1 2 u 3 8 1827 7 ou u og 0 6 
2 E 3 3 9 3 6 i5 i2 7 8 4 7 0 2 
5 17 7 10 5o 7 9 17 18 su 7u 2 10 
4 15 5 10 i 5 ó 19 10 6B 7 3 0 4 
5 4 4 10 Ș Q i 16 19 6 10 1 8 0 s 
g ig a 9 4 nu B 2 B 9 13 17 8 1 11 
7 13 3 «10 is 8 10 32d 16 16 16 7 1 13 
8 2% 15 1n u 4 71 ii 2 10 5 10 It 0 11 
a 4 1 3 15 10 6 7 9 4 3 1 9 1 6 
10 35 14 21 34o 15 9 17 16 7 0 9 9 2 6 


6. The pointer was 134g in. long and was 7% in. wide.

Each dial was mounted on stiff black cardboard. The settings on the test dial were
manipulated by means of a larger dial placed on the reverse side of the test dial. Thus,
settings on each dial could be quickly and conveniently changed.

Procedure: After the subject was seated 
before the binocular eyepiece of the tachisto- 
scope, a dial was exposed for an unlimited 
exposure. The subject was shown the dial, 
and its units and graduations were pointed 
out. The subject was shown several settings, 
and was told that he would be required to 
report the pointer position. The pointer was 
set either on the graduation marker or mid- 
way between two graduation markers. The 
subject was also shown the dial under the 
conditions of timed exposures. The experi- 
menter called “ready” when a trial was to 
be started, and the click of the interval timer 
signaled the starting of the timed exposure. 
The subject reported “11,7 “OY,” “644,” 
“17,” etc. Dials A and B had 21 possible 
settings while Dial C had 40 possible settings. 

The order of presentation of dials, the set- 
ting on each dial, and the time interval were 
systematically varied in order to handle possi- 
ble practice and fatigue effects. Dial A and 
Dial B were each presented 105 times for 
each subject involving the 21 different dial 
settings and the five time intervals tested.


Dial C was presented a total of 200 times to 
each subject. Thus each subject made 410 
judgments involving the three dial faces 
tested, and the results presented are based 
on a total of 4,100 judgments. 
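The presentation counts follow directly from the number of settings and exposure times. The sketch below assumes, as the totals imply, that each setting was shown once at each of the five exposure times; the variable names are illustrative.

```python
EXPOSURES_SEC = (0.1, 0.3, 0.4, 0.5, 0.7)     # exposure periods from the Apparatus section
SETTINGS_PER_DIAL = {"A": 21, "B": 21, "C": 40}
N_SUBJECTS = 10

# Assumed: every setting appears once at every exposure time.
presentations = {
    dial: n_settings * len(EXPOSURES_SEC)
    for dial, n_settings in SETTINGS_PER_DIAL.items()
}
judgments_per_subject = sum(presentations.values())
total_judgments = judgments_per_subject * N_SUBJECTS

print(presentations, judgments_per_subject, total_judgments)
```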


Results and Discussion 


The error score for a given individual for 
any set of dial readings was found by sum- 
ming twice the deviation from the actual 
setting. Thus, $\text{score} = \sum (2E)$, where E is
the deviation of the subject's reading from
the actual dial setting. Each deviation was
multiplied by two simply to eliminate deci- 
mals. These results are shown in Table 1 
for each of the three dials tested. The table 
shows the total error score for each individual 
for comparable sections, dial-wise or quad- 
rant-wise, for Dials A, B, and C. In addi- 
tion, for Dial C the total error score is shown 
for cardinal settings (0, 5, 10, and 15) and 
for intermediate settings (2, 3, 7, 8, 12, 13, 
17, and 18). The quadrants referred to are 
designated as follows: (I) upper-right, (II) 
upper-left, (III) lower-left, and (IV) lower- 
right. 
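A minimal sketch of the error score defined above, assuming that "deviation" means the absolute difference between the reported and actual setting (the text does not say whether sign was kept). The function name and the sample readings are hypothetical.

```python
def error_score(readings, settings):
    """Sum of twice the deviation of each reading from the actual setting.

    Deviations are taken as absolute values (an assumption); doubling turns
    half-unit errors into whole numbers, as the text explains.
    """
    return sum(2 * abs(reading - setting) for reading, setting in zip(readings, settings))


# Hypothetical example: two readings are off by half a graduation each.
example = error_score(readings=[11, 9.5, 6.5, 17], settings=[11, 9, 7, 17])
print(example)
```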

For any statistical test of significance a 
difference score was found for each individ- 
ual. The standard error was then computed 
from the distribution of differences, thus al- 
lowing for the correlation among individuals. 
The results of the tests of significance (t test)
are shown in Tables 2, 3, and 4.
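The difference-score procedure described above corresponds to what is usually called a paired (correlated-samples) t test. The sketch below is one way to compute it with the Python standard library; the function name and the scores in the example are hypothetical and are not taken from Table 1.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """t = mean difference / standard error of the differences.

    One difference score is formed per individual, so the correlation
    between the two conditions is allowed for automatically.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical error scores for five subjects under two conditions.
t_value = paired_t([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(round(t_value, 2))
```

With ten subjects the statistic has 9 degrees of freedom, so the two-tailed 5 per cent critical value is about 2.26.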


Table 2

Significance of Difference Between Error Scores in
Comparable Quadrants of Dials A, B, and C

Quadrant    Dial Comparison    t-value
I           A vs. C            0.62
II          A vs. C            0.45
III         B vs. C            1.73
IV          B vs. C            0.01


Table 3

Significance of Differences Between Error Scores Made
in Quadrants of Dial C Expressed as t-values

Quadrant    I       II      III     IV
I           -
II          1.89    -
III         2.13    0.48    -
IV          0.44    0.81    1.14    -


Table 4

Significance of Differences in Total Error Scores in
Dials A, B, and C Expressed as t-values

Dials             A       B       C (upper half)
A                 -
B                 0.69    -
C (upper half)    0.05    0.90    -
C (lower half)    0.41    1.16    0.65


Four sets of t tests were made relative to the four questions previously raised. The
questions are restated here, as follows:

1) Are errors in a given quadrant a function of the dial shape in which the quadrant
occurs?

2) Are intra-dial errors (Dial C) a function of the quadrant in which readings are
made?

3) Are intra-dial errors (Dial C) related in a systematic way to pointer positions of
9, 12, 3, and 6 o'clock compared to intermediate positions?

4) Are errors a function of the dial half (upper vs. lower) in which readings are
made?

From the results of the t tests shown in Table 2, it may be concluded that dial shape
for the three dials used in the experiment has not been demonstrated to be an important
factor in reading accuracy when the errors produced are considered on a quadrant basis,
since all of the t-values shown are insignificant at the 5 per cent level of confidence.
From the results shown in Table 3 dealing only with the errors produced in the circular
dial (Dial C), it may be concluded that the quadrant from which the settings are read
has not been demonstrated to be a significant factor in reading accuracy, since all of
the t-values shown are insignificant at the 5 per cent level of confidence.
The third major comparison to be considered in the analysis of the data is the result
of the comparison of error performance when errors made at dial settings 0, 5, 10, and
15 are compared with errors made at settings 2, 3, 7, 8, 12, 13, 17, and 18 for the
circular dial (Dial C). The t-value here is 5.89 and the difference is significant at
the .01 level. We, therefore, conclude that in the circular dial used in the study
significantly fewer errors were made at the 9, 12, 3, and 6 o'clock positions than at
the tested intermediate points.
Table 4 shows the comparison of the accuracy of reading the upper half of Dial C
with Dial A, the lower half of Dial C with Dial B, Dial A with Dial B, etc. Here again
none of the t-values are significant at the 5 per cent level of confidence. The results
show that accuracy of reading in upper and lower dial halves does not differ
significantly in the set of dials used in this study.

One additional finding should be noted. The results of previous investigations (2)
concerning the effect of exposure time were verified. Errors decreased as length of
exposure time increased.


Summary

The purpose of this experiment was the determination of the relationship between
accuracy of dial reading and the sector and specific location of the dial pointer. The
three dials used were a semicircular upright dial, a semicircular inverted dial, and a
circular dial. Ten subjects made a total of 4,100 judgments at five exposure times on
the three dials.

Tests of significance for error scores were made and permitted the following
conclusions:

1) Differences in dial shape were not an important source of error.

2) Differences in sector location of the dial pointer were not an important source of
error.

3) Significant differences in error scores were found for readings made at 9, 12, 3,
and 6 o'clock positions corresponding to pointer settings at 0, 5, 10, and 15 when
compared with intermediate points.

4) No significant differences in error scores were found when upper and lower dial
halves were compared.

These findings suggest that critical regions of a scale should be assigned to the 9,
12, 3, or 6 o'clock positions of a circular dial, and that factors other than errors may
be considered in the choice of a dial from among the three types studied here.

Received May 22, 1952.

References

1. Chapanis, A., Garner, W. R., and Morgan, C. T. Applied experimental psychology.
   New York: Wiley, 1949.
2. Christensen, J. M. Exposure time as a factor in dial reading performance. Amer.
   Psychologist, 1951, 6, 387.
3. Fitts, P. M. Engineering psychology and equipment design. In Handbook of
   experimental psychology. S. S. Stevens (Ed.). New York: Wiley, 1951.
4. Grether, W. F., and Connell, S. C. Psychological factors in check reading single
   instruments. USAF Memo. Rept. No. MCREXD-694-17A, USAF, Air Materiel Command,
   20 September 1948. Pp. 21.
5. Handbook of human engineering data for design engineers. Medford, Mass.: Tufts
   College, 1949. Chap. IV, Sec. I.
6. Horton, G. P. An analysis of errors made in a schematic PPI display. USAF Tech.
   Rept. No. 5960, USAF, Air Materiel Command, October, 1949.
7. Kappauf, W. E., and Smith, W. A. Design of instrument dials for maximum legibility
   III. Some data on the difficulty of quantitative reading in different parts of a
   dial. USAF Tech. Rept. No. 5914, Part 3, May, 1950.
8. McFarland, R. A. Human factors in air transport design. New York: McGraw-Hill, 1946.
9. Sleight, R. B. The effect of instrument dial shape on legibility. J. appl. Psychol.,
   1948, 32, 170-178.
10. Spencer, J. Presentation of information by aircraft instruments. II. Instrument
    recording performance under varied illuminating conditions. Flying Personnel Res.
    Comm., No. 154, May, 1951.
11. Warrick, M. J., and Grether, W. F. The effect of pointer alignment on check reading
    of engine instrument panels. USAF Memo. Rept. No. MCREXD-694-17, USAF, Air
    Materiel Command, 4 June 1948.
12. White, W. J. The effect of pointer design and pointer alignment position on the
    speed and accuracy of reading groups of simulated engine instruments. USAF Tech.
    Rept. No. 6014, USAF, Air Materiel Command, July, 1951.


The Journal of Applied Psychology
Vol. 37, No. 2, 1953 


Dimensional Analysis of Motion: V. An Analytic Test of 
Psychomotor Ability * 


Shelby Harris and Karl U. Smith 


University of Wisconsin 


The present paper describes a new test of 
psychomotor skills, based on dimensional 
and component analysis of movements in mo- 
tion. This test, which has been named the 
Analytic Reactometer, permits separate and 
automatic registration of the travel and ma- 
nipulation components of motion involved in 
the successive grasping and manipulating of 
objects.? This development in psychomotor 
testing is of considerable significance for 
several reasons: (1) the test provides more 
detailed and precise measures of the com- 
ponents of motion than has been previously 
possible; (2) it permits, within the same 
instrument, systematic variation of several 
dimensions of motion, such as extent of move- 
ment, direction of movement, extent of ma- 
nipulation, complexity of movement and ma- 
nipulation, plane of movement, hand involved, 
etc.; (3) it provides a means of analyzing 
errors of manipulation in terms of the vari- 
ous dimensions of motion and with regard 
to the component time scores; and (4) the 
principles involved in the test may be in- 
corporated in all types of psychomotor tests 
which may be designed to simulate various 
types of work situations. The desirability of 
a psychomotor test situation which will ac- 
complish the objectives named is indicated 
by the fact that different components and 
dimensions of movement in skilled motion 
patterns are functionally distinct (1, 2, 3, 5,
6) and often uncorrelated. The test to be
described has some significance for the field


1 This study has been supported by funds ... the Legislature, The State of Wisconsin,
and ... by the Graduate School Research Committee, The ...

2 The analysis of results of this study has been ... the Computing Service, The ...


human manual performance in medical and 
industrial research. 


Methods 


The Analytic Reactometer is designed in 
terms of two main features: (1) control of 
the space dimensions of the motion pattern; 
and (2) separate measurement of the ma- 
nipulative and travel components of motion. 

The planned performance situation used in 
the present form of the test is a control panel 
45.7 cm. square, on which are mounted 25 
rotary switches in 5 rows of 5 switches, each 
spaced 7.6 cm. apart (Figure 1). Each 
switch has 17 settings, selected points of 
which are marked as shown in Figure 1. 
The positions thus marked are 40°, 80°, and
180° clockwise and 40° and 80° counter-
clockwise.

The manipulative and travel components
of motion are measured separately in this
test by means of an electronic motion analyzer
(4), consisting of a balanced relay circuit,
in which the subject acts as a key.
When the subject touches one of the switches,
the analyzer is activated and elapsed time is
recorded on a precision time clock * in hundredths
of a second until contact with the
switch is broken. When the clock measuring
manipulation time stops, a second clock,
measuring travel time, starts, and continues
to run until the next switch is touched. Thus,
the elapsed time in operating any pattern of
the switches is totalled separately for manipulation
and travel movement components by
means of the two clocks.
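
The two-clock arrangement described above can be restated as a short computation. The following Python sketch is an illustration only and is not part of the original apparatus: the `Contact` record, the `split_times` function, and the sample times are hypothetical names and values introduced here. It assumes that each switch contact is logged as a touch time and a release time, measured from the start of the trial.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """One switch contact; times are seconds from the start of the trial."""
    touched_at: float   # hand touches the switch: manipulation clock runs
    released_at: float  # contact is broken: travel clock runs


def split_times(contacts):
    """Return (manipulation_total, travel_total) for an ordered list of contacts.

    Manipulation time accumulates while a switch is held; travel time
    accumulates from the release of one switch to the touch of the next.
    """
    manipulation = 0.0
    travel = 0.0
    for i, contact in enumerate(contacts):
        manipulation += contact.released_at - contact.touched_at
        if i + 1 < len(contacts):
            travel += contacts[i + 1].touched_at - contact.released_at
    # The clocks described in the text read to hundredths of a second.
    return round(manipulation, 2), round(travel, 2)


# Hypothetical record of three switches (illustrative values only).
trial = [Contact(0.00, 0.55), Contact(0.90, 1.40), Contact(1.80, 2.45)]
manipulation, travel = split_times(trial)
total = round(manipulation + travel, 2)
```

With these made-up values the two totals are kept apart exactly as the two clocks keep them apart, and their sum gives the total time for the pattern.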

The following types of scores may be obtained
on the test: (1) time involved in turning
the 25 switches; (2) time of travel between
the 25 switches; (3) total time involved
in both manipulation and travel; and


Model S1, Standard Time Clock, Standard Electric
Time Company, Springfield, Massachusetts.




Dimensional Analysis of Motion: V. Psychomotor Ability 


(4) errors made in positioning the switches.
The reactometer permits testing of the performance
of either hand, with different planes,
directions, and magnitudes of movement. To
vary these dimensions of motion, different
settings and patterns of switches may be
used or the control panel itself may be
changed from one plane to another. The
whole test is constructed to be easily transported.

Fig. 1. A schematic diagram of the Analytic Reactometer, showing the arrangement of controls on the panel and the timing mechanism. The inset illustrates the design of each manual control. The special mount for the control panel makes it possible to position the panel in different planes. (Labeled parts: rotary switches, control panel, terminal control switches, electric clocks, battery box, control switches.)

The main objective of this study has been
to analyze, in terms of correlation procedures,
the interrelations between different reactive
variables which may enter into performance
on psychomotor tests. The reliability of scores
related to different dimensions and components
of motion, as performed in the test situation,
has been determined. In addition, intercorrelations
between the components of motion and between
tests involving different dimensions of motion
have been computed.

Twenty tests were carried out in the study
which covered the following aspects of motion:
(1) right and left directions of manipulation;
(2) different directions of travel;
(3) performance with each hand;
(4) horizontal and vertical planes of motion;
and (5) simple and complex patterns of
manipulation. All switches on the board
were used in each test. In all of the tests
the manipulative movement consisted of a 
40° rotation of the switch, either right or 
left to positions 1 or 5 respectively. The 
travel movement from switch to switch was 
horizontal (left to right) in some tests and 
vertical (downward) in other tests. Com- 
plex manipulation patterns differed from the 
simple ones in that alternate switches were 
turned in opposite directions. It was not 
feasible to use a balanced sequence of the 
different tests to control practice effects. In- 
stead, for this preliminary study, the twenty 
different tests were administered to all subjects
in the same sequence.

A total of 78 college students served as
subjects in the study. All 20 tests described
above were given to each subject. Each test
required approximately one minute to administer.
The subject was instructed to turn the
switches on the panel in the predefined
patterns as rapidly as possible and at the same
time to be careful to position each switch
accurately. When a new general pattern
of motion was introduced, the subject
was given a practice trial on the first ten
switches to be turned in this pattern of motion.
Reliability figures and intercorrelations
between components and dimensions of
motion were computed not only for each
individual test but also for combined scores of
the various tests involving a common dimension
of motion. Of the 78 subjects, 49 repeated
the sequence of tests some 10 to 14
days after the initial administration. Data
obtained on these subjects are used to compute
the test-retest reliability of the different
measures obtained on the Reactometer.

Fig. 2. Distributions of manipulation and travel scores in the simple manipulation pattern for the right and for the left hand in 78 subjects. The distribution of scores for manipulation by the two hands is nearly identical, whereas the travel scores show some discrepancy between the hands. Both pairs of distributions are slightly skewed positively. (Ordinate: number of subjects; abscissa: scores in seconds.)


Results 


Typical distributions of test scores are 
presented in Figures 2 and 3. The distribu- 
tions for the right and left hands shown in 
Figure 2 are based on combined scores of
all tests involving simple manipulation
patterns. There were 8 of these tests for each
of the hands. Figure 3 shows the analogous 




Table 1

Means and Standard Deviations for Different Combined Tests on the Analytic Reactometer

                         First Test                 Second Test
                  Manipulation    Travel      Manipulation    Travel
Test                M      σ      M     σ       M      σ      M     σ
Hor. Plane         68.5   17.4   42.2   9.2    58.1   13.1   41.2   7.2
Ver. Plane         59.1   14.2   40.2   8.5    53.0   13.0   41.1   7.9
Right Hand         63.4   15.0   40.2   8.9    54.8   12.7   40.1   7.9
Left Hand          64.2   16.0   42.2   8.6    56.3   13.2   42.2   [?]
Lat. Direction     65.3   15.7   43.9   9.4    55.4   12.5   43.6   8.1
Ver. Direction     62.3   15.3   38.5   8.1    55.7   13.1   38.7   [?]
Manip. Right       63.9   15.4   41.0   8.8    55.6   12.8   41.3   7.8
Manip. Left        63.8   15.3   41.3   8.4    55.5   12.8   41.0   [?]
Total Simple      127.7   30.5   82.3  17.1   111.1   25.5   82.3  15.1
Total Complex      36.1    8.6   20.5   4.2    32.0    7.8   20.5   3.6
Table 2 
Test-Retest Reliability of the Component Tests with Respect to Both 
Manipulation and Travel Scores 
Note: Each test was of one minute duration. 
Manipulation Travel 
` A I. Simple Manipulation 
A. Horizontal Plane 
1. R. H., R. Manip., Trav. R. -64 -60 
2. R. H., L. Manip., Trav. R. > 50 
3. L. H., R. Manip., Trav. R. n3 1 
l 4. L. H., L. Manip., Trav. R. 68 ‘31 
5. R. H., R. Manip., Trav. In 66 50 
6. R. H., L. Manip., Trav. In 28 "70 
7. L. H., R. Manip., Trav. In 3 ‘57 
8. L. H., L. Manip., Trav. In y j 
B. Vertical Plane 74 1 
9. R. H., R. Manip., Trav. Right ‘0 ‘6S 
10. R. H., L. Manip., Trav. Right ‘81 72 
11. L. H., R. Manip., Trav. Right 10 “50 
l 12. L. H., L. Manip., Trav. Right ” 17 ‘58 
13. R. H., R. Manip., Trav. Down 78 67 
14. R. H., L. Manip., Trav- Down 19 <69 
15. L. H., R. Manip., Trav. Down 4 66 
16. L. H., L. Manip., Trav. Down 3 
II. Complex Manipulation 
A. Vertical Plane F Jå 3 
| 17. R. H. R-L Manip., Trav. Right 17 
- RE, i cht i d 
18. L. H., R-L Manip., Trav. Rig 78 44 
' -L Manip., Trav. Down - 
, 19. R. H, R ’ Trav. Down 16 61 


20. L. H., R-L Manip., 


140 Shelby Harris and Karl U. Smith 


Table 3

Test-Retest Reliability for Various Combined Scores

                               Manipulation   Travel
Right Hand                         .81          .75
Left Hand                          .87          [?]
Right Manipulation                 .85          .81
Left Manipulation                  .84          [?]
Lateral Travel                     .83          .76
Down and In Travel                 .85          .75
Horizontal Plane                   .81          .73
Vertical Plane                     .86          .75
Total Simple Manipulation          .86          [?]
Total Complex Manipulation         .86          [?]


distributions for combined scores of four tests 
involving complex manipulation patterns. It 
may be seen from these distributions that 
both the manipulation time and travel time 
distributions are similar for the two hands. 
All of the distributions approach normality. 

The test and retest means and standard 
deviations for various combined scores are 
given in Table 1. Each of the combined 
scores is based on the performance of 49 sub- 
jects for all of the tests which involved the 
specified dimension. With the exception of 
the compound score of all tests involving 
complex manipulation, all of these figures 


are based on the tests involving simple 
manipulation patterns. Comparison of the 
means for various dimensions of motion is 
not justified due to the lack of control over 
practice effects. 

Tables 2 and 3 present the test-retest re- 
liability figures for the twenty individual tests 
and for the various combined scores. The 
reliability figures for the individual tests are 
presented in the order that the tests were ad- 
ministered. The combined scores on which 
the reliability figures in Table 3 are based 
are the same as those of Table 1. All of the 
reliability values are relatively high. The 
manipulation-time coefficients are consistently 
higher than the travel-time values. 
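
A test-retest coefficient of the kind reported in Tables 2 and 3 is ordinarily a product-moment correlation between first-test and second-test scores of the same subjects. The sketch below shows that computation in Python under that assumption; the source does not state the exact formula used. The function name `pearson_r` and the six pairs of scores are hypothetical and are given for illustration only, not taken from the study.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both lists")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical manipulation times (seconds) for six subjects, test and retest.
first_test = [63.4, 55.0, 71.2, 48.9, 66.7, 59.3]
second_test = [54.8, 50.1, 60.3, 45.2, 58.9, 49.7]
reliability = pearson_r(first_test, second_test)
```

Note that a general drop in the retest means, such as a practice effect would produce, does not by itself lower the coefficient, since only the relative standing of the subjects enters the computation.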

Table 4 shows the correlations between the 
manipulation and travel components of mo- 
tion for the combined scores and the correla- 
tions between the several dimensions of mo- 
tion involved in the study. All of these 
figures, which are based on data of 78 sub- 
jects, are positive coefficients. It is obvious 
from the table that the relationship between 
the components of motion is consistently 
low. Nine of these coefficients are significantly
different from zero at the one per cent
level. One is significant at the five per cent
level. The correlations between dimensions
of motion are high for both manipulation and


Table 4 


Correlations Between Components and Dimensions of Motion 


Correlation between Dimensions


Man. vs. Trav. Manipulation Travel

Right Hand 37** ‘a 
Left Hand 31" ual j 
Right Manipulation ‘ahd 

Left Manipulation pe par a 

& 3i 

Lateral Travel 41** 

Down and In Travel ‘20** sae" ail 
Horizontal Plane 

Vertical Plane a 85** il 
Total Simple Manipulation 36** 

Total Complex Manipulation ogg 93** i 

25 i 

* Significant at 5% level.
** Significant at 1% level.


travel components of motion. Among the 
correlations between dimensions, the values 
for planes of motion are somewhat lower than 
those for other dimensions. Generally, the 
intercorrelations between dimensions of mo- 
tion are higher for the manipulative aspects 
of motion than those for the travel com- 
ponents. 
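
Whether a coefficient based on 78 subjects differs significantly from zero can be checked in several ways. The sketch below uses the Fisher z approximation, which is a technique chosen here for illustration; the source does not say which test the authors applied. The function names and the sample coefficients are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3")
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def significance_level(r, n):
    """Label a coefficient as significant at the 1% level, 5% level, or neither."""
    p = fisher_z_p_value(r, n)
    if p < 0.01:
        return "1%"
    if p < 0.05:
        return "5%"
    return "n.s."


# Illustrative coefficients with n = 78, the number of subjects in the study.
labels = {r: significance_level(r, 78) for r in (0.37, 0.25, 0.10)}
```

Under this approximation a coefficient of modest size can be clearly non-zero with 78 subjects and still indicate a weak relationship, which agrees with the text's description of the component correlations as significant but consistently low.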


Summary 


A special psychomotor test for separate 
measurement of the travel and manipulation 
components of motion has been described. 
The test, called the Analytic Reactometer, 
permits controlled variation and measure- 
ment of different bodily and space dimen- 
sions of motion which are involved in various 
types of motion patterns. 

Preliminary investigations employing the instrument
have yielded the following general
results: 

1. Critical sources of variation in perform- 
ance in various motion patterns of the type 
studied are related to the manipulation and 
travel components of motion. 

2. Performances in different space dimen- 
sions of both manipulation and travel move- 
ments correlate highly with one another. 

3. The reliability of specific tests related 
to hands, planes, direction of travel, direc- 
tion of manipulation and complexity of the 
manipulation pattern in the general test situation
described typically exceeds + .80 for
manipulation and + .75 for travel move- 
ments. 

4. The present test, and the principles be- 
hind it, provide one means of securing pre- 
cise and analytical data for exact quanti- 
tative specification of motions and motion 
functions. Application of analytical methods 
described to studies of growth, aging, neuro- 
logical deficiency, and to industrial selec- 
tion may advance considerably the scientific 
validity of data concerning human motion. 


Received June 13, 1952. 


References 


1. Davis, R., Wehrkamp, R., and Smith, K. U. Dimensional
analysis of motion: I. Effects of
laterality and movement direction. J. appl.
Psychol., 1951, 35, 363-366.

2. Lincoln, R. S., and Smith, K. U. Systematic
analysis of factors determining accuracy in
visual tracking. Science. In press.

3. Rubin, G., Von Trebra, P. A., and Smith, K. U.
Dimensional analysis of motion: III. Complexity
of movement pattern. J. appl. Psychol.,
1952, 36, 272-276.

4. Smith, K. U., and Wehrkamp, R. A. Universal
motion analyzer applied to psychomotor performance.
Science, 1951, 113, 242-244.

5. Wehrkamp, R., and Smith, K. U. Dimensional
analysis of motion: II. Effects of travel distance.
J. appl. Psychol., 1952, 36, 201-206.

6. Von Trebra, P. A., and Smith, K. U. Dimensional
analysis of motion: IV. Transfer effects
and direction of movement. J. appl. Psychol.,
1952, 36, 348-353.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 2, 1953


Applied Psychology in Action 


Editor’s Note: With this issue, we begin 
what may become a regular feature of the 
Journal of Applied Psychology. We plan to 
publish brief descriptions of applied psy- 
chology in action to be written by psycholo- 
gists who are applying psychology in real 
life situations. Brief news notes concerning 
applied psychology in action from a variety 
of sources will be published. Descriptions 
of procedures and techniques believed to be 
effective, even though desirable experimental 


controls may not have been possible, will be 
included. Thus, a forum for the interchange 
of practical information will be provided 
practitioners of applied psychology. In part, 
this new feature of the Journal of Applied 
Psychology attempts to meet the challenge 
contained in Dr. Marion A. Bills’ presidential 
address before the Division of Industrial and 
Business Psychology last September. It is
appropriate, therefore, to begin with the publi- 
cation of her provocative address. 




Our Expanding Responsibilities * 


Marion A. Bills 


Aetna Life Insurance Company, Hartford, Connecticut 


Three items lead me to the choice of the 
title for this talk: (1) the most interesting 
diaries which many of the psychologists who 
are working full time in industry have kept 
for two weeks and sent in as a foundation 
for a case book in industrial psychology; 
(2) a meeting which I attended of psychia- 
trists and psychologists working in industry 
which was held in Asbury Park this spring; 
and, (3) the criticisms which have been 
made, some in writing and many in dis- 
cussions, that our published research is at 
a very superficial level.

The diaries which we have received from individuals in private industry [...] that our duties [...] management, [...] between [...] the setting of [...] to do with the [...] time was spent on where to locate a new plant with emphasis on labor procurement. The diary ended, “If I sent you one three months from now it would be entirely different.” With Wage Stabilization still in force, how to keep within the law and still run a business is occupying 70 per cent of one psychologist’s time in a mid-western company. One person delayed sending in his diary until union negotiations were over because he had done nothing “psychological” (this is a direct quotation) for the month he had been handling the negotiations.

* Presidential address delivered before the Division of Industrial and Business Psychology [...] Meeting in Washington, D. C.

Many of the diaries varied from day to day. They included conferences (we seem to run to conferences) on wage systems, including merit rating; conferences on training ranging from salesmen to supervisors to hourly workers and including such detailed items as the purchase of an opaque projector for use in safety training; discussions on the editorial policy of a house organ; conferences on pension systems, and how to prepare the individual for retirement and some actual work with the individuals; attendance at a meeting on a proposed Stock Purchase Plan; and so through the entire range of management. Throughout most of the diaries was an occasional hour or two spent directing or actually doing work on research problems
and one sensed that in many instances
there was a desire to do more—time only 
being lacking. One of the diaries ended with 
an hour devoted to consulting with one of his 
assistants on a research problem of selec- 
tion and then honesty prevailing he added a 
note, “This hour is really wishful thinking; 
it was only 15 minutes.” 

The great volume of managerial work that 
we do and which was clearly brought out in 
the diaries was pointed up for me at the 
meeting of the psychiatrists and psycholo- 
gists at Asbury Park. You can count on the 
fingers of one hand the number of psychia- 
trists in private industry and of these few, 
only one was talking of management prob- 
lems and he apologized for his interest. The 
psychiatrists, whether they be dealing with 
charwomen or the president of a company, 
were all talking of individuals as individuals. 
Our interest in groups and in organizations 
was entirely lacking among them. This is a 
ball which we are apparently carrying alone. 

How and why have we gotten ourselves into
this situation for we are in it much more than 
doctors, lawyers or engineers. First, I be- 
lieve because we are a newer science and our 
field is much less defined. A problem must 
have at least a medical tinge before manage- 
ment goes to a medical department but psy- 
chology being a bit vague in the mind of 
management they feel free to turn to a psy- 
chologist on almost any problem. Second, 
because by and large we have felt rather complimented
to get into many managerial functions
and have taken them on willingly.
Third, I believe the most important is that 
as we go into managerial work we carry with 
us many fundamental psychological princi- 
ples, and so influence management in the way 
that as psychologists we feel they should be 
influenced and our influence is greater be- 
cause we do not wear a tag which says “psy- 
chologists.” What are these principles that 
we carry over? I believe one of the most im- 
portant is the principle of “Stop, look and 


listen” [...] into us in all of our [...] has long been accustomed [...] financial problems, or [...] costs in factory upkeep, etc., [...]


their judgments on people have been on a 
random basis of single cases—rumor and 
prejudices. I remember 30 years ago one 
man who had built up a big business selling 
office machines told me that all black-haired 
men were dishonest; at that point no amount 
of pointing out honest black-haired men had 
any effect on his prejudice and yet almost 
any one of us given a year or two and some 
tact could have worn him down and at least 
improved his evaluation of personnel. In 
any company it is a long selling job that per- 
sons’ reactions can be studied on a scientific 
basis—that persons can be selected for any 
job with a fairly accurate prediction of suc- 
cess or failure—that they will react in certain 
ways to certain types of training—that what 
they want can be determined—that fair wage 
rates can be established that will take into 
consideration the difficulty of the job and the 
efficiency of the individual and will cause at 
least 50 per cent of the personnel to say the 
company is fair. One has only to make one 
or two sales of this type and they may at the 
beginning take a long time until one becomes 
a part of management. This is what I think 
has happened to us. We have made the 
sales. As management has grown to realize 
that their personnel is their chief asset, the 
person that can tell them about that person- 
nel has been drawn into decision making 
functions. 

It’s a long selling job because one must not 
only convince top management who probably
were already favorable before we were hired, 
but we must sell the idea all the way down 
the line that the scientific approach is going 
to make each person’s work more effective and 
take not a whit away from his own responsi- 
bilities. 

We have learned a great deal over the 
years; perhaps more than we have given and 
much more than we realize, but the final re- 
sult has been beneficial to both management 
and ourselves. Let us give an example. Our 
first study of the interview was a debunking 
of it. We showed very successfully that the 
average interview was a very weak tool for 


selection of personnel. For example—you all 
remember those experiments by Hollingworth,
where 10 sales managers interviewed 20 men


and if each picked the two that he considered 
best, 18 would have been chosen. But where 
did this most interesting scientific experiment 
get us? Practically nowhere! As psycholo- 
gists we got excited but sales managers said, 
“How interesting” and went right on picking 
salesmen by interviews only. Then, grad- 
ually we modified the approach. In substance 
we said, “The interview is the tool by which 
the final decision must be made but what 
information concerning the individual can 
we give you—the sales manager—that will 
help in this final decision?” With research 
we showed that some tests were helpful—that 
there were certain ways of scoring an applica- 
tion blank that came out with an indication 
of success or lack of success. Management 
bought the results and we had learned to 
play on the team and playing on the team 
we could gradually make suggestions which 
changed somewhat the type of interview and 
helped to make it as an interview more suc- 
cessful. 

We have a mighty heritage of at least 
seventy-five years of psychological research 
back of us to which is constantly being added 
new and valuable data and ideas. Much of 
it is written down in our literature. Some 
has been handed to us by word of mouth 
from our professors and colleagues. It is well 
worth using and we are using it but the criticism
that our publications as psychologists
in private industry have been too few and
too superficial is probably just.

For example, for at least three years many of us have been worrying about the frustrated foreman. I think we have [...] our own companies both on the policy setting level and [...] foreman or supervisor, but it's the academic man who writes about it. We do [...] particularly like [...]; he puts the blame too [...] the many complications [...] resentful. We talk about [...] preaching [...] we do not write [...] as we see him. [...] say the subject is too big or too hot, and
what we write, if we write at all, is a small
statistical study of how to select or rate the 
bench worker or the file clerk. Of course, 
being a psychologist I now modify my state- 
ment and say that there are exceptions, and 
some of these exceptions are outstanding. 
However, on my desk as I wrote this was the 
December, 1951 Journal of Applied Psychology, and the Winter Number of Personnel Psychology and with the everlasting compulsion of a psychologist to count (we all seem to have this compulsion), I counted the articles in these two magazines. [...] 14 articles were from persons connected with colleges, 5 from the military force, 2 from consulting firms, and one made up of two junior authorships from persons in private industry. I think this is fairly typical and certainly our showing is not good, and our friends in the consulting field, although they did twice as well as we did, cannot pat themselves on the back too much either. Together we contributed only a sixth of the articles in the two magazines where you would expect us to make the best showing.

Why then the big difference between what we are doing and what we are reporting? There are many reasons but may I illustrate a few and I am asking you to bear with me while I quote an experience of my own so far back that it no longer has a personal connotation. Twenty-five years ago we had an experiment on sound-proofing, by installing a sound-proof ceiling in a department for which we had good production records, and [...] continue those records after the installation. Based on the results we spent a half [...] dollars sound-proofing a new building. I am fully convinced that the decision was [...] but I never published the results. The [...] amateur statistician could have shot them full of holes; a fear complex partly, but also a recognition that some results cannot be accurately measured. As one talks to psychologists in private industry one hears this often. I know one psychologist who has set up a new training program for supervisors. He is sure it is successful but he has no measured results. In discussing it he said, “If I had only thought to count the number


of frowns I got in the department before the 
training, and the number of smiles I get now, 
maybe the data would be statistically valid 
at at least the 5 per cent level.” At heart 
we are still strictly scientific. Practice has 
forced us to make decisions on bases which 
cannot be scientifically proven. We have 
learned that a workable solution on time is 
worth more than a perfect one too late, so
we don’t publish.

We are not alone in this dilemma. Dr. Cameron of the National Industrial Health Service, at the meeting of Psychiatrists and Psychologists at Asbury Park, told us of many health projects set up by industry, and pleaded with us for some way of measuring their success. We cannot usually in industry set up experimental controls. For example, Dr. Cameron in talking to me said, “You have a visiting nurse? Does she pay for herself?” I am sure she does but I can’t prove it. Of course under laboratory conditions we could set up controls; we could give [...] services to half of our office force and not to the other half. Then we could keep [...] of absenteeism, turnover and even make morale surveys for the two halves; and perhaps come out with proof, but can you see any company being willing to set [...]? I certainly cannot see my[...], much less advising the Aetna, to [...] such measures to prove something which we already think we know.

Perhaps we are still too conscious of our heritage that any idea to be worth publishing must represent research and valid [...]. Perhaps knowing the complications of human behavior we become so involved when we try to write in general terms and for popular consumption we put in so many “ifs and buts” that we get discouraged and leave the writing to the nonpsychologists who can go all out for a given plan and forget the complications of which we are so conscious. Perhaps since we do not need to publish to advance in our work, and since we are fairly [...] we get a little [...] and do not take [...] to [...] our thinking and put it down on paper for others to read and maybe profit by. But the fact remains, we do not publish.

In talking with our Medical Department 
they tell me that they have two types of 
journals—one of a strictly scientific nature 
where 80 per cent at least of the contributions 
come from research centers, and another type 
where the contributions are mostly from prac- 
ticing physicians, and maybe reports of single 
cases, or small groups of cases. No one ex- 
pects them to be valid research articles but 
they are often very suggestive. 

Is this our solution? Perhaps, but before 
it can work we must change some of our psy- 
chological thinking. It’s a very thin and 
sometimes wavy line between the funda- 
mental concepts that form at least a part of 
our contribution to management and our ac- 
ceptance of the less rigid principles of proof 
that prevail in the management field. Would 
our publications in this less rigid field add 
anything to the fundamental knowledge of 
psychology? Perhaps the trees are so thick 
that we do not see the forest and perhaps we 
have got to wait until some of our colleagues 
who have been through the experience retire, 
and getting at a distance, which gives them 
an objective viewpoint, become our spokes- 
men. Perhaps our real function is that of a 
liaison officer between our experimental work- 
ers and management under which function 
our chief duty would be to keep very well 
informed on both sides, and display the ingenuity
to connect them, even when in many
cases the connection is far from obvious.
I know of one case where a strictly experimental
study by Birch and Rabinowitz on
the two cord problem helped to set up a
change in a sales training course for salesmen.

I realize that this talk has been full of


“perhaps,” which means that questions have 
been raised, and no conclusions reached, but 
psychologists in private industry are only 
about 100 strong and we need the advice of 
our consulting friends and especially of our 
academic ones to help us see clearly where 
our greatest contribution to a young but 


fast growing science lies. 




Calling in Psychologists Early 


A few years ago, military men found that 
a lot of new weapons were getting too com- 
plicated for the men who had to operate them. 
Industry found much the same thing with 
new plant equipment. Electronic and me- 
chanical devices had been added so fast that 
the human mind could not keep up. Applied
psychologists were called in to “humanize”
the machines.

Shortly after the end of World War II, the
psychologists were turned loose on the nearly
complete designs for new machines to be used
for military production. At that point, about
all the psychologists could do was change
dials for easier reading, color or illumination
for less eye strain, size or shape of knobs
and wheels for easier identification, and a
few other minor things that would not hurt
the over-all engineering. Obviously this
helped some, but it was not enough. Private 
industry was even slower to tackle the prob- 
lem, largely because of the expense involved 
in re-designing machines. 

To remedy this, consultant firms, such as 
Dunlap & Associates, Inc., of Stamford, 
Conn., are campaigning for a place in the 
early design stages of equipment develop- 
ment. Most government groups favor this 
new approach, but industry, in general, is 
skeptical. Industry seems to agree that 
more consideration should be given to the 
human factors early in the design process. 
But it does not think that design engineers 
are going to be happy about psychologists 
butting into the blueprint phase of the prob- 
lem. (Business Week, December 20, 1952.) 




Book Reviews 


Wolfle, D., Buxton, C. E., Cofer, C. N., 
Gustad, J. W., MacLeod, R. B., and Mc- 
Keachie, W. T. Improving undergraduate 
instruction in psychology. New York: 
Macmillan, 1952. Pp. vii + 60. $1.10. 


Surely it is now even more true than when 
H. G. Wells made the statement, that there 
is a race between education and catastrophe. 
The committee making this report is able, 
headed by a man who was probably closer 
than any other to the work of psycholo- 
gists during the war and in the immedi- 
ate post-war period. From this group might 
therefore be expected a program having as a 
major feature, a broad and stimulating view 
of the vital importance of psychology in the 
present-day world. 

But national and world affairs seem not 
even mentioned in the volume. One would 
never know there had been a world war! The 
first chapter, on objectives, emphasizes “the 
contribution which psychology can make to 
a liberal college education,” but the concept 
of such education seems formal and remote 
from the current scene. Nor should the 
major objective of first work in psychology 
be to foster students’ “personal growth and 
increased ability to meet personal and so- 
cial adjustment problems adequately.” A 
four-page chapter on “Personal Adjustment 
Courses” declares scornfully that “it is no 
more justified to consider such a course as a 
course in psychology than it would be to 
substitute . . . a course on household repairs
for introductory physics” (p. 41). And
courses which “deal with special interest
areas or purport to provide technical training”
are only tolerated; there is admiration
for “a few conscientious departments, determined
to provide the best possible training
for students, which have recommended
that such courses be eliminated, even at the
risk of decreasing enrollments and displeasing
other departments” (p. 24). The major
chapter, on “The Recommended Curriculum,”
urges a first course giving “a systematic
presentation of scientific content”
followed by core courses on “motivation, perception,
thinking and language, ability,” and
advanced courses in social psychology, physi- 
ological psychology, etc. A two and a half 




page chapter on “Technical Training in 
Psychology” suggests that after such an 
undergraduate program, “a few months of 
full-time vocationally-oriented training in a 
post-A.B. institute could give the student a 
battery of job skills” (p. 44). And the con- 
cept of “liberal college education” becomes 
fairly clear: a program in which psychology 
need feel no responsibility for world or na- 
tional or community problems, or student wel- 
fare, or vocation—and may smugly go its 
own self-centered way. 

A six-page chapter on “Implementation of 
the Curriculum” points out (for instance) 
that, though the proposed program may bring 
some reduction in number of courses, staff 
can be absorbed by laboratories. A final brief 
chapter on “Research Problems underlying 
the Curriculum” suggests (for example) ap- 
praisal of the first course by number of 
students taking further work in psychology, 
and touches briefly on methods of instruc- 
tion; in spite of its title, the volume deals 
with this last topic only incidentally—is given 
over to emphatic declaration for a systematic 
theoretical undergraduate program, and im- 
patient belittling of alternatives. 

Such a partisan position can be adequately 
appraised only by a balancing comparison 
with alternatives. Presumably an alterna- 
tive report might emphasize that indeed “wars 
begin in the minds of men,” that psycho- 
logical warfare is more powerful than the H- 
bomb, that psycho-social problems are major 
in any nation and any community—and that 
psychologists should courageously do any- 
thing they can to bring general understand- 
ing of these issues. It might be proudly con- 
fident that psychology had much to offer 
students in better understanding themselves 
and their problems, and that such help could 
be a vital part of a broad systematic treat- 
ment of the subject. It might be exhilarated 
by the vocational usefulness of much psy- 
chological material, and find therein enrich- 
ment of its essential subject-matter. Instead 
of a statement which many would have con- 
sidered conservative and professionally in- 
trovert thirty years ago, it might be a docu- 
ment which would give college administrators 
and faculty in other departments a stimu- 




lating view of psychology as a science rapidly 
advancing and eager to cooperate in efforts 
to build educational programs more fully 
meeting the problems of the present world. 
The one-sidedness and inadequacy of the 
present little volume seems to the reviewer 
to emphasize a need for such an alternative 
document. Lacking it, the hope must be 
that the Education and Training Board may 
take a position more positive and forward- 
looking. 


Sidney L. Pressey 
Ohio State University 


Hirsh, I. J. The measurement of hearing.
New York: McGraw-Hill Book Co., 1952. 
Pp. ix + 364. $6.00. 


This book is concerned with the informa- 
tion about acoustics, electro-acoustic equip- 
ment, psychology of hearing and related topics 
that is basic to both the clinical and experi- 
mental measurement of various aspects of 
hearing. Written by an experimental psy- 
chologist thoroughly acquainted with psy- 
chophysical the treatise stresses 
It is designed 
for use as a reference by those engaged in
ng disorders, as
a text for those preparing for this kind of
nce material for 
ng. 
chophysical meas- 
discussion of the 
lectricity basic to 
on and measurement of audi- 
These principles are then


Applied psycholo 
Mainly in the sect; 
plications, 


Job of organizati ` 
highly techni Position with 


r, th 




must not expect to absorb the material with- 
out concentrated study. This excellent book 
is a must for anyone preparing to do research 
in the measurement of hearing as well as for 
all those with a serious interest in the field. 
Readers will be especially grateful for the 
completeness of the information that accompanies
the figures and for the glossary of
technical terms. It is probable that the
book will find its greatest use as a reference 
work on techniques of measurement and clini- 
cal applications. 
Miles A. Tinker


University of Minnesota 


Campbell, C. M. (Editor). Practical applications
of democratic administration.
New York: Harper & Brothers Publishers,
1952. Pp. 325. $3.00.


Generally speaking school administrators
in the United States have voiced their allegiance
to the broad concepts of democracy.
How these concepts can be given vehicle in
schools and school administration is not
clear to many superintendents and principals.

Practical Applications of Democratic Administration
represents in condensed form the
thinking of a dozen scholars on this subject.
Their contribution is predicated upon a sound
philosophical basis, which turns early to practical
applications based upon research and
experience.

Leadership in educational administration is
the ribbon which binds the separate contributions
together in a package which should
be both appealing to, and much sought after
by, all professional educators. The two chapters
dealing with sociology and psychology
are strikingly illustrative of the integration of
broad fields of understanding which democratic
leadership must draw upon. These
same two chapters offer to many present-day
school administrators a challenge for considering
more realistically than they have
been accustomed to, the social forces in their
school communities and the vestiges of autocracy
which individuals and groups have
inherited.

Readers are not left with a “so what”
attitude, because seven chapters follow immediately
and describe in clear expository
style actual administrative leadership practices
in a number of school communities.



At times it is somewhat difficult to align the 
separate examples with specific prongs of the 
foregoing theory. This is, however, under- 
standable because democracy does not pro- 
pose to “blueprint” practice. Instead the 
stream of democracy is fed by leadership 
from local tributaries each carrying in sus- 
pension particles unique to its own fields of 
origin.

The role of administrative leadership is not 
easy as pointed out in the concluding chap- 
ters, yet the present volume is filled with the 
necessary materials. Although democratic 
idealism seems to have swept the country, 
the next step suggested for educators is to 
reach consensus on somewhat more specific 
points of both theory and practice. It is un- 
likely that this can be accomplished in a 
setting which is pessimistic. 

The implication is clear that a science of 
human engineering coupled with educational 
statesmanship has germinated and is in seri- 
ous need of cultivation. Neglect in this area 
points to a spotty blighted harvest that will 
represent only a fractional part of the poten- 
tial. Therefore, carefully formulated experi- 
mental programs like those cited in this brief 
volume should become the usual rather than 
the unusual practice of current and future 
educational leadership in our democratic so- 
ciety. To do this will take school superin- 
tendents and principals away from many of 
their present routines and “behind the desk” 
management activities out into the community.
This will not sell the school short,
however, because the community will bring 
back to it a richness otherwise unattainable. 


Hugh M. Shafer
School of Education,
University of Pennsylvania 


Dooher, M. J., and Marquis, Vivienne (Eds.).
The development of executive talent. New 
York: American Management Association, 
1952. Pp. 576. $6.75. ($5.75 to AMA 
members.) 


For anyone who wants to develop a cen- 
tralized planned economy in the United 
States, this book will offer little of value for 
it is oriented to: “Management's role in the 
preservation of a free society is putting the 
real meaning of a free society to work within 
the organization for which each individual 




executive is responsible. The basic objective 
is the development of individuals. . . . The 
basic purpose of management is absolutely 
consistent with that of a free society, and 
the individual manager’s responsibility is to 
work that way.” 

For anyone who has negative reactions to 
“management-minded” research and publica- 
tions or who feels that business management 
as a function in our society is part of our 
reactionary past, this book will offer little of 
interest. For this book is based upon the 
principle of “management as a profession, a 
science, and an art.” It recognizes that 
management has an aggressive, dynamic role 
to play in the major national and interna- 
tional struggle between two ways of life, but 
that the successful performance of this role 
is greatly dependent upon the development 
of capable leadership. The purpose of the 
book, then, is to bring together the produc- 
tive experiences and practices of many or- 
ganizations in their efforts to produce man- 
agement and executive personnel who will 
function most effectively in achieving the 
goals of a free society. 

From this it should not be inferred that the 
book is either political or controversial. The 
main body of the book is divided into nine 
parts, consisting of 50 chapters, contributed 
by 44 authors from business and educational 
organizations. The subjects covered include: 
setting up the program—basic principles and 
practices; organization planning; putting the 
program into action; conference training 
methods; special approaches, techniques and 
programs; getting results from follow-up 
counseling; program evaluation; and trends 
in management development. The remain- 
ing pages, consisting of about one-half of the 
entire volume, are devoted to case study re- 
ports of methods used by such companies as 
Standard Oil (N. J.); United Parcel Serv- 
ice; Sears, Roebuck and Co.; Detroit Edison; 
U. S. Rubber; Westinghouse; and others. In 
addition, there is an extensive bibliography 
of approximately 400 items divided under a 
variety of sub-topics relating to leadership 
and management. 

Most of the chapters offer a combination 
of practical “how to do it” material and dis- 
cussions from the experimental literature. 
Although some academicians may be disap- 




pointed in a large part of the material, it does 
present a carefully planned compromise ap- 
proach for the practicing businessman and 
the classroom educator. While no step-by- 
step solution to individual problems is offered, 
a pattern of action is noted. In the final 
analysis, the book contains specific, prac- 
tical guidance on all of the problems involved 
at every stage of planning and administra- 
tion—from the analysis of needs, through the 
discovery of latent executive ability, to the 
inventorying, rating and development of 
executive skills. 


C. G. Browne 
Wayne University 


Judd, D. B. Color in business, science, and 
industry. New York: John Wiley & Sons, 
Inc., 1952. Pp. 401. $6.50. 


The measurement and specification of color 
have undergone great advances during the 
past thirty years. This book should prove a 
fruitful venture because, during the period 
of maximum development in colorimetry, the 
author has become an outstanding authority 
and has held a strategic position at the Na- 
tional Bureau of Standards. The scope of 
the book is broad and records everything that 
has appealed to the writer as pertinent or 
interesting in respect to color. Such a treatment
cannot be expected to be complete since
the thinking of the author is set down as final
without consideration of alternative facts and
theories.

The background of the book is physical 
both in the material presented and in the 
point of view. Other influences have made 
but have not altered the 
for example, is men- 


The Presentation is jn t 
further co; i 


which have entered into the author’s thinking
on the subject. The treatment is unsystematic.
Materials of physiological, physical
and psychological character are intermingled.
No distinction is made between data and
hypotheses and the philosophy is that of
naive physical realism. Nevertheless, the exposition
proceeds from physiology of the eye
to the tristimulus mixture of colors including 
radiation by the way. It is difficult to assess 
the pertinence of various topics. The reader 
is forced into an item by item evaluation 
which must be confusing to a novice in the 
field. It would have been sufficient for the 
remainder of the book had the author limited 
the introduction to a statement of the tri- 
stimulus hypothesis and the development of 
its colorimetric implications. As the discus- 
sion stands, it is not clear precisely what the 
author himself holds to be the “three color 
hypothesis.” He states that “we have seen
that normal color vision is tridimensional”
(p. 67), but whether this derives from “the
fact that we get three independent kinds of
information from the cones, light-dark, red-green,
yellow-blue” (p. 18) or from the hypothesis
that “some of the cones contain
short-wave absorbing (V) pigment, some contain
a preponderance of long-wave absorbing
(R) pigment, and some contain a preponderance
of middle-wave absorbing (G) pigment”
(p. 18) is not indicated.

In Part II we have the meat of the book
under the title “Tools and Technics” of
colorimetry. It occupies some two-thirds of
the text and gives precise and comprehensive
information needed to carry out measurements
of color and to specify them in one of
the alternative systems of notation. The reviewer
considers this work one of the outstanding
presentations of colorimetry, one
that may very well become the standard reference
in the field.

It is regrettable that the author felt the
necessity to go beyond this field. Business
and industry will hardly be concerned with
the physiological hypotheses of vision or the
psychophysical techniques of psychology.
Moreover, no one can expect to speak with
authority on all subjects.

Forrest L. Dimmick
U. S. Naval Medical Research Laboratory,
New London, Connecticut


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson,
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.

Problems of consciousness. H. A. Abramson, Editor.
Josiah Macy, Jr. Foundation. New York:
Corlies, Macy and Co., 1952. Pp. 153. $3.25.

Geography of living things. M. S. Anderson. New
York: Philosophical Library, 1952. Pp. 202. $2.75.

Statistical theory in research. R. L. Anderson and
T. A. Bancroft. New York: McGraw-Hill Book
Co., Inc., 1952. Pp. 399. $7.00.

Human relations and administration. Kenneth R.
Andrews, editor. Cambridge: Harvard University
Press, 1952. Pp. 271. $4.50.

An introduction to field theory and interaction theory.
Chris Argyris. New Haven: Labor and Management
Center, Yale University, 1952. Pp. 71.

Design for a brain. W. Ross Ashby. New York:
John Wiley and Sons, Inc., 1952. Pp. 259. $6.00.

Mental prodigies. Fred Barlow. New York: Philosophical
Library, 1952. Pp. 256. $4.75.

Operations research. James H. Batchelor. Cleveland:
Case Institute of Technology, 1952. Pp. 95.
$1.00.

How to write advertising that sells. Clyde Bedell.
New York: McGraw-Hill Book Co., Inc., 1952.
Pp. 519. $6.00.

A manual for the differential aptitude tests. George
K. Bennett, Harold G. Seashore, and Alexander G.
Wesman. New York: The Psychological Corporation,
1952. Pp. 77.

Mental hygiene for classroom teachers. Harold W.
Bernard. New York: McGraw-Hill Book Co.,
Inc., 1952. Pp. 472. $4.75.

Practical psychology. Karl S. Bernhardt. New York:
McGraw-Hill Book Co., Inc., 1953.

The infirmities of genius. W. R. Bett. New York:
Philosophical Library, 1952. Pp. 192. $4.75.

The philosophy of social work. Herbert Bisno.
Washington, D. C.: Public Affairs Press, 1952.
Pp. 139. $3.25.

Research in the training of teachers. Henry Bowers.
Toronto: J. M. Dent and Sons, Ltd., 1952.
$1.90.

General psychology. Revised edition. Robert Edward
Brennan. New York: The Macmillan Co.,
1952. Pp. 524. $5.50.

Applied psychology. Harold Ernest Burtt. New
York: Prentice-Hall, Inc., 1952. Pp. 480.

The psychology of learning. James Deese. New
York: McGraw-Hill Book Co., Inc., 1952.
$5.50.

An outline of nonacademic personnel administration
in higher education. Donald E. Dickason. Champaign:
Donald E. Dickason, 1952. Pp. 52. $2.00.

The dynamics of social action. Seba Eldridge.
Washington, D. C.: Public Affairs Press, 1952.
Pp. 118. $2.50.


The scientific study of personality. H. J. Eysenck.
New York: The Macmillan Co., 1952. Pp. 298.

Deaf children in a hearing world. Miriam F.
Fiedler. New York: The Ronald Press Co., 1952.
Pp. 320. $5.00.

Let’s hear it! George W. Frankel. New York:
Stratford House, Inc., 1952. Pp. 63.

Understanding old age. Jeanne G. Gilbert. New
York: The Ronald Press Co., 1952. Pp. 442.

Problems of the family. Fowler V. Harper. Indianapolis:
Bobbs-Merrill Co., Inc., 1952. Pp.
850. $9.00.

Appraising personality. Molly Harrower. New York:
W. W. Norton and Co., Inc., 1952. Pp. 197.

The fundamentals of social psychology. Eugene L.
Hartley and Ruth E. Hartley. New York: Alfred
A. Knopf, Inc., 1952. Pp. 832. $5.50.

Psychoanalysis as science. Ernest R. Hilgard, Lawrence
S. Kubie, and E. Pumpian-Mindlin. Stanford,
Calif.: Stanford University Press, 1952. Pp.
158. $4.25.

The measurement of hearing. Ira J. Hirsh. New
York: McGraw-Hill Book Co., Inc., 1952. Pp.
364. $6.00.

The treatment of the young delinquent. J. Arthur
Hoyles. New York: The Philosophical Library,
1952. Pp. 261. $4.75.

Some principles of construction of group intelligence
tests for adults. Husen and Henrickson. Stockholm:
Almquist and Wiksell, 1951. Pp. 98.

The Ames demonstrations in perception. William H.
Ittelson. Princeton: Princeton University Press,
1952. Pp. 81. $4.00.

Measurement in education. Arthur M. Jordan. New
York: McGraw-Hill Book Co., Inc. Pp.
521. $5.25.

The Tree test. Charles Koch. New York: Grune
and Stratton, Inc., 1952. Pp. 87.

Evolution and human destiny. Fred Kohler. New
York: The Philosophical Library, 1952. Pp. 118.

Functional neuroanatomy. Wendell Krieg. New
York: The Blakiston Co., 1953.

Psychological studies of human development. Raymond
G. Kuhlen and George G. Thompson. New
York: Appleton-Century-Crofts, Inc., 1952. Pp.
533. $3.50.

A history of psychology in autobiography. Vol. IV.
Herbert S. Langfeld et al., editors. Worcester:
Clark University Press, 1952. $7.50.

How to talk with people: a guide for the improvement
of communication in committees. Irving J.
Lee. New York: Harper and Brothers, 1952.


Prescription for rebellion. Robert Lindner. New
York: Rinehart and Co., Inc., 1952. Pp. 305.
$3.50.

Powers of the mind. Paul Maslow. Vol. II of The
Life Science. Brooklyn: Paul Maslow, 16 Court
Street, 1952. Pp. 153. $3.50.

Man’s search for himself. Rollo May. New York:
W. W. Norton and Co., 1953. Pp. 277. $3.50.

Finality and form. Warren S. McCulloch. Springfield:
Charles C Thomas, Publisher, 1952. Pp. 63.
$3.75.

The focussed interview. Robert K. Merton, Marjorie
Fiske, and Patricia Kendall. New York:
Bureau of Applied Social Research, Columbia University,
1952. Pp. 202. $3.00.

Social and psychological factors in opiate addiction.
Alan S. Meyer, editor. New York: Bureau of
Applied Social Research, Columbia University,
1952. Pp. 169. $1.00.

Office psychiatry. L. G. Moench. Chicago: Year
Book Publishers, Inc., 1952. Pp. 299. $6.00.

The philosophy of psychiatry. Harold Palmer. New
York: Philosophical Library, 1952. Pp. 63. $2.75.

The sensations, their functions, processes, and mechanisms.
Henri Piéron. New Haven: Yale University
Press, 1952. Pp. 469. $6.00.

The inmates. John Cowper Powys. New York:
Philosophical Library, 1952. Pp. 318. $4.50.

Advanced statistical methods in biometric research.
C. Radhakrishna Rao. New York: John Wiley
and Sons, Inc., 1952. Pp. 390. $7.50.

The secret self. Theodor Reik. New York: Farrar,
Straus and Young, Inc., 1952. Pp. 329. $3.50.

Volunteer work camp. Henry W. Riecken. Cambridge:
Addison-Wesley Press, Inc., 1952. Pp.
260. $3.60.

Administering changes: a case study of human relations
in a factory. Harriet O. Ronken and Paul
R. Lawrence. Boston: Division of Research, Harvard
Business School, 1952. Pp. 324. $3.50.

Medical public relations. Schuler, Mowitz, and
Mayer. Ann Arbor: Edwards Brothers, Inc., 1952.
Pp. 227.

Fundamental concepts in clinical psychology. G.
Wilson Shaffer and Richard S. Lazarus. New
York: McGraw-Hill Book Co., Inc., 1952. Pp.
493. $6.00.

A practical guide for troubled people. Lee R. Steiner.
New York: Greenberg, Publisher, 1952. Pp. 299.
$3.50.



Introduction to logical theory. P. F. Strawson.
New York: John Wiley and Sons, Inc., 1952. Pp.
263. $3.50.

Experimental diagnostics of drives. Med. L. Szondi.
New York: Grune and Stratton, Inc., 1952. Pp.
272. $13.50.

Our common neurosis. Charles B. Thompson and
Alfreda P. Sill. New York: Exposition Press,
1953. Pp. 210. $3.50.

Introduction to testing. Arthur E. Traxler et al.
New York: Harper and Brothers, 1952. Pp. 394.
$5.00.

Out of step. Joseph Trenaman. New York: The
Philosophical Library, 1952. Pp. 217. $4.75.

Red Wing—five years later. Roland S. Vaile. Minneapolis:
University of Minnesota Press, 1952.
Pp. 27. Gratis.

A further study of visual perception. M. D. Vernon.
New York: Cambridge University Press, 1952.
Pp. 263. $7.00.

Statistical tables and problems. (Third edition.)
Albert Waugh. New York: McGraw-Hill Book
Co., Inc., 1952. Pp. 242. $3.00.

Range of human capacities. Second edition. David
Wechsler. Baltimore: The Williams and Wilkins
Co., 1952. Pp. 190. $4.00.

Contributions toward medical psychology. Arthur
Weider, editor. New York: The Ronald Press
Co., 1952. Pp. 885. $12.00.

Personnel interviewing. James D. Weinland and
Margaret V. Gross. New York: Ronald Press
Co., 1952. Pp. 416. $6.00.

Improving undergraduate instruction in psychology.
Dael Wolfle, et al. New York: The Macmillan
Co., 1952. Pp. 60. $1.25.

Personality and problems of adjustment. Kimball
Young. New York: Appleton-Century-Crofts, Inc.,
1952. Pp. 716. $5.00.

Helping parents understand the exceptional child.
Child Research Clinic. Langhorne, Pa.: The Woods
Schools, 1952. Pp. 42. Available on request.

Selection, training, and use of personnel in industrial
research. Proceedings of the Second Annual Conference
on Industrial Research. New York: King’s
Crown Press, 1952. Pp. 274. $4.50.

Employee personnel practices in colleges and universities—1951-1952.
Champaign: College and
University Personnel Association, 1952.
$2.50.




Journal of Applied Psychology 


Vol. 37, No. 3


JUNE, 1953 


The Measurement of Leadership Attitudes in Industry 


Edwin A. Fleishman * 


USAF Air Training Comm 


Recent years have seen an intensified concern
in industry for the problems of leadership
and human relations. Evidence of this
can be seen in the increasing number of leadership
training programs which have been instituted
in various industries. However, those
who train supervisors must still rely on a limited
number of general assumptions largely
unsupported by either sound theory or empirical
data. Part of this difficulty arises
from the fact that effectual leadership depends
to a great extent on the situation. Additional
difficulties stem from the lack of
adequate criteria of group effectiveness. A
pressing need is the development of dependable
research instruments which can be utilized
to describe adequately the various complex
socio-psychological aspects of a wide
variety of leader-group situations.¹ If these
were available, they might later be related to
criteria of group effectiveness in many specific
situations in which leaders function. The
present study was a further attempt to develop
a number of such instruments which
would have application in industry.


* This i es 
research was carried Ou Bae 
uss at the Personnel Researc d, Ohio Stat 
i EA as part of a larger Pi 
ar stey, with the cooperation 0 
er Com $ , en 
The Lackland Mit Force Base, San Antonie, pot 
are Pinions or conclusions contained in ages 
str those of the author. They are not to a te 
ae as reflecting the views OF jndorsemen' 
Partment of the Air Force. 
velop, ©, Personnel Research B 
leaqhment of such instruments @ 
1 age research program. & 
» Shartle (9), Seeman i 
4), Hemphill (5), Hemphill and Westie 
Urement (1, 2). Another Cogs 
Nelson a leadership attitudes 


f the 




and, Human Resources Research Center ** 


In a previous paper (2) the writer has de- 
scribed a questionnaire found useful for the 
description of supervisory behavior. The 
present paper describes questionnaires which 
were developed for the measurement of lead- 
ership attitudes. 


Construction of the Questionnaire 


A preliminary 110-item Leadership Opin- 
ion Questionnaire was administered to 100 
foremen in a pilot study at the company’s 
Central School. These foremen represented 
17 different company plants. The foreman 
indicated for each item how frequently he 
thought he should do what each item de- 
scribed. He responded by marking one of 
five frequency alternatives which followed 
each item (e.g. always, often, occasionally, 
seldom, never). He was told that there were 
no right or wrong answers in the question- 
naire since “everyone’s work group is different 
and what is the best way to lead one group 
may not be the best way for another.” 

The items in this questionnaire were gen- 
erally parallel to those in the pre-test form of 
the Supervisory Behavior Description previ- 
ously described (2). However, in this latter 
questionnaire the items were worded in terms 
of “what does your own supervisor actually 
do” while in the present questionnaire items 
were worded in terms of “what should you
do.” The questionnaire was scored along two
major and two minor dimensions.² Of the


² These dimensions were originally isolated by a
factor analysis of the items of a Leader Behavior
Description questionnaire administered to 300 Air
Force crew members who described their airplane
commander (4). Later analysis of the items, based
on this industrial population, supported only the
two major factors “Consideration” and “Initiating
Structure” (2).


two major dimensions, one was called “Con- 
sideration,” which contained items reflecting 
the extent to which the supervisor is consid- 
erate of the feelings of those under him. It 
comes closest to representing the “human re- 
lations” approach toward group members. 
The second major dimension was called “Initiating
Structure,” and contained items reflecting
the extent to which the supervisor
facilitates or defines group interactions to- 
ward goal attainment. He does this by plan- 
ning, communicating, scheduling, criticizing, 
initiating new ideas, etc. The two minor fac- 
tors were called, “Production Emphasis,” and 
“Social Sensitivity.” Response distributions 
were obtained for the alternatives to each of 
the items in the questionnaire. 

The corrected split-half reliability estimates
for the two major keys “Consideration”
and “Initiating Structure” were .69 and .73,
respectively, and for the two minor keys
“Production Emphasis” and “Social Sensitivity”
the reliabilities were .36 and .33, respectively.
In the light of the low reliabilities
of the latter two keys and in view of the
fact that a modified factor analysis of the
items in the parallel Supervisory Behavior
Description indicated that only the two major
dimensions were meaningful in this industrial
population, the dimensions of “Production
Emphasis” and “Social Sensitivity” were
omitted from the revised form.³
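
The “corrected” split-half estimates reported above can be illustrated with a short computation. The sketch below assumes an odd-even split of the items and the Spearman-Brown step-up formula, r_full = 2r / (1 + r). The paper says only that the split-half correlations were corrected, so both choices are assumptions, and the function names and any data passed in are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length (assumed correction)."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Corrected split-half reliability for one scoring key.

    item_scores holds one row per respondent; each row lists that
    respondent's item weights (0 to 4) for the items in the key.
    The odd-even split used here is an assumption, not the paper's stated method.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_half, even_half))
```

Under the same assumption, a corrected estimate of .69 would correspond to a half-test correlation of about .53, since r = r_full / (2 - r_full).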

The criteria for selecting items for the revised
form included: (1) the response distributions
of the items in the Leadership


; and (2) the factor 


d 

Twenty items were selected in this manner
for the “Consideration” key and 20 items
³ Actually, in this analysis it appeared that in the
industrial sample, “Initiating Structure” and “Production
Emphasis” were reflections of the same underlying
dimension, as were “Consideration” and “Social
Sensitivity.”




were selected for the “Initiating Structure”
key. Examples of items in the revised “Consideration”
key were:


Help people in the work group with their 
personal problems. 

Back up what people under you do.

Speak in a manner not to be questioned 
(response weights reversed). 


Examples of items in the revised “Initiat- 
ing Structure” key were: 


Emphasize meeting of deadlines.

Assign people in the work group to particu- 
lar tasks. 

Meet with the work group at regularly 
scheduled times. 


Administration of the Revised Questionnaires 


Various forms of the revised questionnaire
were administered in one of the company
plants.


A total of 122 foremen filled out the following
forms with the indicated response
“sets”:

(1) A Leadership Opinion Questionnaire:
How he thinks he should lead his own work
group.

(2) A questionnaire entitled, “What Your
Boss Expects of You”: A description of how
the foreman feels his boss wants him to lead
the work group.

A total of 394 workers filled out a questionnaire
entitled, “How You Expect an Ideal
Foreman to Act.” This is a description of
worker expectations regarding leadership
behavior.

Also, 60 supervisors above the rank of foreman
filled out the following questionnaires:

(1) A Leadership Opinion Questionnaire:
How the boss thinks he should lead the foremen
under him.

(2) A questionnaire entitled, “What You
Expect of Your Foremen”: A description by
the boss of how he wants his foremen to lead
their workers.

All these forms are variations of the Leadership
Opinion Questionnaire revised on the
basis of the pilot study. All contained the
same 40 items reworded slightly in certain
forms to apply to the appropriate situational
context.


Measurement of Leadership Attitudes in Industry 155


Table 1

Means, Range, Standard Deviations, Reliabilities, and Intercorrelations of
Dimension Scores in Each Revised Instrument

Instrument and Dimension¹                  Mean    S.D.   Range²        Reliability   Inter-
                                           Score                        Estimate³     correlation

Filled out by foremen (N = 122)
  Leadership Opinion Questionnaire
    Consideration                          53.9     7.2   36 to 74      .70
    Initiating Structure                   53.3     7.8   34 to 69      .79           -.01
  “What Your Boss Expects of You”
    Consideration                          48.5    10.2   21 to 68      .87
    Initiating Structure                   51.2     8.4   31 to 68      .78            .05

Filled out by workers (N = 394)
  “How You Expect an Ideal Foreman to Act”
    Consideration                          57.0     5.5   41 to 70      .89
    Initiating Structure                   44.2     3.9   26 to 58      .88           [illegible]

Filled out by foreman’s boss (N = 60)
  “What You Expect of Your Foremen”
    Consideration                          53.0     7.3   38 to 67      .64
    Initiating Structure                   54.0     6.7   [illegible]   [illegible]   -.31
  Leadership Opinion Questionnaire
    Consideration                          58.0     6.4   40 to 75      .60
    Initiating Structure                   52.4     7.6   31 to 69      .82           -.23

¹ Each dimension key in each questionnaire contained 20 items.
² Since alternatives to each item were weighted zero to four, the
highest possible score is 80 in each questionnaire for each dimension.
³ Split-half correlations corrected to full length of each dimension
by the Spearman-Brown formula.


Results


Analysis of the Questionnaires. On all
the instruments, the five alternative responses
to each item were assigned weights from zero
to four. Whether the high frequency alternative
(e.g., always) was weighted zero or four
depended on the item’s orientation with respect
to the total dimension continuum. Total
dimension scores were derived by adding
the weights corresponding to the alternatives
marked for the items in each dimension. Table 1
presents a summary of the means, range
and standard deviations, reliabilities,
and intercorrelations for each instrument.
The striking feature of Table 1 is the independence
of the two dimensions in each of the
forms used. This is especially true when the
forms are filled out by workers and by foremen.
The correlations cluster about zero.
For the foremen’s supervisors, the correlations
are low relative to those usually obtained
with such instruments and do not reach the
1% level of significance.
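The scoring rule described above (weights of zero to four per item, reversed for items keyed in the opposite direction, summed over the 20 items of a key) can be sketched as follows. This is only an illustrative sketch: the function name `dimension_score` and the use of item indices to flag reversed items are assumptions made for the example, not part of the original instruments.

```python
ITEMS_PER_KEY = 20                 # each dimension key contained 20 items
MAX_WEIGHT = 4                     # alternatives weighted zero to four
MAX_SCORE = ITEMS_PER_KEY * MAX_WEIGHT  # highest possible score: 80


def dimension_score(responses, reversed_items=frozenset()):
    """Sum the item weights for one dimension key.

    responses      -- list of 20 ints in 0..4, the weight of the marked alternative
    reversed_items -- hypothetical set of item indices whose weights are reversed
    """
    if len(responses) != ITEMS_PER_KEY:
        raise ValueError("expected 20 responses for one dimension key")
    total = 0
    for index, weight in enumerate(responses):
        if not 0 <= weight <= MAX_WEIGHT:
            raise ValueError("weights must lie between 0 and 4")
        # A reversed item scores 4 when the low-weight alternative is marked.
        total += (MAX_WEIGHT - weight) if index in reversed_items else weight
    return total
```

A respondent who marks the top alternative on every item scores 80 when no item is reversed, which matches the ceiling stated in the table footnote.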


The important thing in interpreting the reliability
coefficients is their magnitude relative
to the dimension intercorrelations. Apparently,
these instruments tap reliably two
independent dimensions of leadership attitudes.
This is especially [illegible]
item for item [illegible]
parallel item in a Supervisory Behavior Description
questionnaire. An ideal but time consuming
procedure would have been to repeat
the factor analysis on the attitude items, but
the independence of dimensions seems to have
been accomplished by the present procedure.
At least it appears that the usual “halo” effect,
which often inflates the intercorrelation
among keys in instruments of this type, has
been efficiently partialed out in the revised
form. The distributions of scores obtained
from most of the questionnaires are roughly
normal in shape.
The implication of these findings seems to
be that the dimensions of “Consideration”
and “Initiating Structure” are as meaningful
and as independent in the attitudinal realm


156 Edwin A. Fleishman


Table 2

Comparison of the Leadership Attitude Scores of
Workers, Foremen, General Foremen,
and Superintendents

Dimension                 Level in the Organization      N     Mean    S.D.

“Consideration”           Superintendents                13    52.6    8.1
                          General Foremen                30    53.2    7.1
                          Foremen                       122    53.9    7.2
                          Workers                       394    57.0    5.5

“Initiating Structure”    Superintendents                13    55.5    5.7
                          General Foremen                30    53.6    6.9
                          Foremen                       122    53.3    7.8
                          Workers                       394    44.2¹   3.9

¹ Indicates this mean differs significantly (beyond the
.01 level of confidence) from the mean of the foremen
group.

of leadership as in the behavioral realm. It 
thus appears that supervisors may be high in 
the amount of consideration they feel should 
be shown their subordinates, but at the same 
time may be either low or high in the amount 
of planning, criticizing, pushing for produc- 
tion, and general “structuring” behavior that 
they feel they should engage in. There is also 
the indication that workers who want a great 
deal of “consideration” in their foremen do 
not necessarily want less “structuring” or 
more “structuring” of their work activities 
from him.

Attitudes at Different Levels. The questionnaires
entitled “What You Expect of
Your Foremen” (filled out by supervisors),
Leadership Opinion Questionnaire (filled out
by foremen), and “How You Expect an Ideal
Foreman to Act” (filled out by workers) all
[illegible]


The comparison shows that the attitudes of 
the foreman group are much more like the at- 
titudes of superiors than they are like the 
attitudes of the workers. Differences be- 
tween the mean scores of the foremen and 
their bosses are not statistically significant. 
Differences between the scores of the foremen
and those of the workers are highly significant.
This is true of scores on both leadership
dimensions. The workers prefer more
“consideration” and less “structure.”⁴ It
also appears that the higher people were in
the plant hierarchy, the less “consideration”
they felt the workers should get. Moreover,
the higher the level, the more “structuring”
the people felt should be initiated with the
work group. However, some of these differences
were not large or significant although
consistent. The tendency was for the foremen’s
attitudes to fall somewhere between
what the workers expect and what their supervisors
expect.

Table 2 also indicates the relatively small
standard deviations of the scores made by
workers on both dimensions of the form “How
You Expect an Ideal Foreman to Act.” On
each dimension the differences between the
sigmas of worker attitude scores and that of
supervisor attitude scores are statistically
significant (P < .01). It appears that the
workers are more homogeneous with respect
to their leadership expectations than are the
supervisory groups with respect to how they
feel groups should be led. However, these
scores present little evidence revealing the
existence of an “ideal leadership” stereotype
among workers since there was still a considerable
range of scores on both expected
“consideration” and expected “structure” at
the worker level (see Table 1).

It was also possible to compare the leadership
attitudes of supervisors above the rank
of foreman toward foremen and workers.
This comparison is between scores made by
these supervisors on the Leadership Opinion
Questionnaire and the form “What You Expect
of Your Foremen.” This comparison is
presented in Table 3.


⁴ It is interesting to note that although the workers
were generally on a piece rate basis, they prefer
less “structuring,” which consists in large part of
foremen activities pushing for production.




Table 3

Comparison of the Leadership Attitudes of Foremen’s Supervisors Toward
Workers and Foremen (N = 60 Supervisors)

                                    Leadership Attitudes
                            Toward Workers      Toward Foremen
Dimension                   Mean     S.D.       Mean     S.D.      Difference     r

“Consideration”             53.0     7.3        58.0     6.4       P < .001       .58
“Initiating Structure”      54.0     6.7        52.4     7.6       P > .05        .73


It can be seen that supervisors above the 
rank of foreman scored significantly higher 
on their “Consideration” attitudes toward the 
foremen than in their “Consideration” atti- 
tudes toward workers. However, the differ- 
ence in the amount of “Structuring” they felt 
should be initiated toward each group is not 
statistically significant. Moreover, bosses 
who scored high in their “Consideration” at- 
titudes toward foremen tended also to score 
higher on these attitudes toward workers 
(r= .58). This was also true for “Initiat- 
ing Structure” attitudes toward foremen and 
worker groups (r = .73). 

Differences between Work Groups in Their 
Leadership Expectations. An analysis of 
variance was made of the scores derived from 
226 workers, drawn from 73 different work 
groups on the questionnaire “How You Expect
an Ideal Foreman to Act.” This analysis
revealed significant differences between
work groups relative to that within work
groups (F = 14.7, P < .01) in how “considerate”
they expect an ideal foreman to be.
Apparently, worker attitudes concerning the
amount of “consideration” desired depend to
a large extent on the particular work groups.
However, differences between work groups in
how much “structuring” behavior they felt 
the foremen should engage in were not signifi- 
cant. This may be due to the small variation 
in scores on this dimension for the total sam- 
ple of workers (sigma = 3.9, see Table 2). 

Relationships with Labor Grievance Rates.
A problem of future research with these instruments
is a well-controlled criterion study
relating these measures to various criteria of
group effectiveness in a variety of leadership-group
situations in industry. The independence
of dimension scores has special relevance
here since each may be differentially related
to such criteria, depending on the situation.
Although such a criterion study was beyond
the scope of the present investigation, correlations
were obtained between some of the
questionnaires and labor grievance rates in
23 departments over an eight-month period.
In this limited study only one correlation
reached the 1% level of significance (based
on an N of 23 departments). This was the
correlation of -.53 between the departmental
grievance rate and the mean scores
of foremen in each department on the “Consideration”
dimension of the form “What
Your Boss Expects of You.” The correlation
with the “Initiating Structure” score of this 
form was .32. The trend was for depart- 
ments with high worker grievance rates to 
be those whose foremen perceived their own 
supervisors as expecting them to lead with a 
low degree of “consideration” and a high de- 
gree of “structuring.” These results, of 
course, are purely suggestive. An adequate 
evaluation of the value of these instruments 
in predicting group effectiveness must await 
additional research. 
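As a rough check on the two grievance-rate correlations reported above, a correlation can be converted to a t statistic with N - 2 degrees of freedom. The sketch below assumes a two-tailed t test, which the paper does not state explicitly; the critical value 2.831 is the standard tabled two-tailed 1% value for 21 degrees of freedom, and the function name `t_from_r` is chosen only for this illustration.

```python
import math


def t_from_r(r, n):
    """Convert a Pearson r based on n pairs to a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_DEPARTMENTS = 23
T_CRIT_01_DF21 = 2.831  # tabled two-tailed 1% critical t for 21 df (assumed test)

for label, r in [("Consideration", -0.53), ("Initiating Structure", 0.32)]:
    t = t_from_r(r, N_DEPARTMENTS)
    reaches_1_percent = abs(t) > T_CRIT_01_DF21
    print(f"{label}: r = {r:+.2f}, t = {t:.2f}, reaches 1% level: {reaches_1_percent}")
```

Under these assumptions only the “Consideration” correlation exceeds the critical value, in line with the statement that a single correlation reached the 1% level.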

The Leadership Opinion Questionnaire has 
been found of value in the evaluation of a 
leadership training course for foremen and 
in the study of certain social factors affecting 
the foreman’s leadership role (1, 3). 


Summary 


The development of questionnaires to meas- 
ure certain aspects of leadership attitudes in 
industry has been described. The question- 
naires were designed to measure two relatively 
independent dimensions of leadership atti- 
tudes. These dimensions were called “Con- 
sideration” and “Initiating Structure.” Vari- 
ous forms of the questionnaires, revised on 




the basis of a pilot study, were administered 
at various levels in the industrial hierarchy. 
On each questionnaire, the dimensions were 
shown to have sufficient reliabilities, insig- 
nificant intercorrelations with each other, and 
adequate distributions. 

A comparison of the leadership attitude 
scores at four plant levels revealed that the 
higher people were in the plant hierarchy, the 
less “Consideration” they felt the workers 
should get and the more “Structuring” they 
felt should be initiated. The attitudes of the 
foreman group on each dimension fell some- 
where between what the workers expect and 
what their own supervisors expect, but were 
much more like the attitudes of their super- 
visors. 

A comparison also was made between the 
attitudes of the supervisors of foremen toward 
leading foremen and toward leading workers. 
The results showed that these supervisors 
scored significantly higher in the amount of 
“Consideration” they felt should be shown 
the foremen relative to that shown to work- 
ers, but no significant differences in their 
“Structuring” attitudes toward each group. 
High correlations were found between these 
attitudes of supervisors toward foremen and 
toward workers on both dimensions. 

With reference to the workers’ attitudes 
concerning the amount of “Consideration” 
they would like in an “ideal foreman,” the 
results indicate this depends to a large ex- 
tent on the particular work group. There 
were significant differences between work
groups relative to that within work groups in
the amount of “Consideration” desired, but
insignificant differences with respect to the
“Structuring” desired. The work- [illegible]
lower degree of consideration and a higher
degree of structuring.

It should be stressed that the findings re- 
ported here are regarded as specific to the 
particular plant and the groups of workers 
and supervisors studied. Additional research 
is needed before valid generalizations can be 
made. It is possible that future research will 
indicate that combinations of measures of 
such things as group characteristics, needs 
and expectations, leadership attitudes, behav- 
iors and perceptions, pressures from super- 
visors, etc. can yield more successful predic- 
tions where ordinary testing procedures have 
failed in the complex field of leadership and 
group effectiveness.


Received August 4, 1952. 


References 


1. Fleishman, E. A. Leadership climate and supervisory
behavior. Columbus, Ohio: Personnel
Research Board, Ohio State University, 1951.

2. Fleishman, E. A. The description of supervisory
behavior. J. appl. Psychol., 1953, 37, 1-6.

3. Fleishman, E. A. The leadership role of the foreman
in industry. Engineering Experiment Station
News, Ohio State University, 1952, 24.

4. Halpin, A. W., and Winer, B. J. Studies in aircrew
composition II: The leadership behavior
of the airplane commander. HRRL Contract,
Technical Report No. 3, Personnel Research
Board, Ohio State University, May
1952.

5. Hemphill, J. K. Leader behavior description.
Columbus, Ohio: Personnel Research Board,
Ohio State University, 1950.

6. Hemphill, J. K., and Westie, C. M. The measurement
of group dimensions. J. Psychol.,
1950, 29, 325-342.

7. Nelson, C. W. The development and evaluation
of a leadership scale for foremen. Ph.D.
Thesis, 1949, University of Chicago.

8. Seeman, M. A status factor approach to leadership.
Columbus, Ohio: Personnel Research
Board, Ohio State University, 1950.

9. Shartle, C. L. Leadership and executive performance.
Personnel, 1949, 25, 370-380.

10. Stogdill, R. M., and Shartle, C. L. Methods for
determining patterns of leadership behavior in
relation to organizational structure and objectives.
J. appl. Psychol., 1948, 32, 286-291.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


Productivity and Attitude Toward Supervisor * 


C. H. Lawshe and Bryant F. Nagle 


Occupational Research Center, Purdue University 


Of utmost importance in the study of human
behavior are the factors which motivate
individuals. Inquiries into the motivations
of people and the relations of these motivations
to performance are being initiated and
expanded throughout the field of psychology.
This is especially true in industrial psychology.

In the industrial situation psychologists are
asking, what motivates employees toward productive
effort? How do the financial rewards
of work, the behavior of the supervisor, the
nature of the job itself, and the goals of management
affect the effort of employees (4)?
The most common approaches to these questions
have been through the use of questionnaires,
interviews and projective techniques
to determine the attitudes of employees.
Psychologists are seeking to relate employee
attitudes to the actual industrial practices
of paying for services, of supervising, of
performing the job, and of setting goals that
will result in the highest productivity. Implicit
in this approach is the assumption that
productivity is related to employee attitude.
This assumption is accepted by nearly everyone,
yet little experimental evidence has been
presented. This paper is a report on the relationship
between employee attitudes and
productivity.


Subjects 


The population used in this study is part
of the office force of a large industrial plant
and is divided into a number of departments.
Since some of the departments in the plant
are small and since some did not participate
in all phases of the study, this report is based
on only 14 work groups. Of the 223 non-managerial,
salaried employees in these 14
work groups, 208, or 93%, participated in the
portion reported here.

* For the past three years the Purdue Research
Foundation and Louisville Works of the International
Harvester Company have been cooperating in
personnel research. This report is concerned with
one phase of a larger study involving relationships
between work group productivity, employee
attitudes, and supervisory sensitivity to employee
attitudes. A complete report of the study will appear
later in the literature.


Productivity Criterion 


The Rating Procedure. Since each of the 
14 work groups was engaged in a different 
type of activity it was impossible to get com- 
parable objective measures of output. In- 
stead, a paired comparison rating of produc- 
tivity was used. Six executives in the plant 
(1. Works Manager, 2. Training Director, 3. 
Staff Assistant to Works Auditor, 4. Staff 
Assistant to Works Manager, 5. Assistant 
Works Auditor, 6. Works Auditor) were asked 
to indicate those work groups which they felt 
capable of rating. The range of selections 
was from eight to 14. The executives were 
supplied with paired comparison forms and 
instructed to indicate “. . . The department
in each pair which is, in your opinion, doing
its job better.”

Each executive’s ratings were converted to
standard scores as suggested by Lawshe, Kephart
and McCormick (6). The standard
scores for each work group as given by the
odd numbered raters were averaged and correlated
with the mean standard scores of the
even numbered raters. The resulting coefficient
of .78 was stepped up by the Spearman-Brown
formula to estimate the reliability of
the means of all six raters, yielding an r of
.88.
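The step-up just described follows the general Spearman-Brown formula, r_k = k r / (1 + (k - 1) r), where k is the factor by which the measure is lengthened. A minimal sketch, with `spearman_brown` as an illustrative function name and k = 2 for doubling from three raters to six:

```python
def spearman_brown(r, k=2.0):
    """Estimate reliability after lengthening a measure by a factor of k."""
    return k * r / (1 + (k - 1) * r)


# Odd-rater vs. even-rater correlation of .78, stepped up to all six raters.
print(round(spearman_brown(0.78), 2))  # 0.88
```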

Measuring Productivity. Previous at- 
tempts of researchers to measure the produc- 
tivity of work groups have generally followed 
two lines. 

1. One approach (4) has sought to find 
situations in which there are comparable work 
groups, groups performing the same kind of 
work under the same conditions. Measures 
of productivity for the various work groups
can be directly compared, since each group
is doing the same job with the same equipment.
In the best known application of this
approach (4) the productivity measures for 
the various groups had little variability, and 
the resulting relationships with measured at- 
titudes were small. While this approach to 
group productivity has many advantages, it 
is limited by the rarity with which one finds 
a number of work groups comparable in size, 
work performed, physical working conditions, 
equipment, and financial rewards. 

2. Another approach (2) has sought to 
determine how well each work group meets its 
output quota. A “normal” level of output is 
set for each work group, and the group’s 
relative productivity is represented by the 
actual output divided by the normal level 
prescribed. This approach is limited not only 
by the validity of the output levels prescribed, 
but also by the fact that the productivity of 
many work groups, especially in an office 
situation, can rarely be measured in physical 
units turned out, either because it is not the 
job of the particular group to process so many 
of this or that, or because the work output is 
regulated by the activities of other depart- 
ments. 

Rating Limitations. The use of a rating 
approach to work group productivity as util- 
ized in this study also has its limitations. It 
has the prime limitation of any rating system; 
one does not know for sure what the raters 
really had in mind when they rated. In this 
case an effort was made to stress the relative 
performance of the work groups. Verbal association
of a supervisor with his particular
work group was avoided in an effort to minimize
the influence of su[illegible]
in the ratings, [illegible]
that the raters [illegible]
How little [illegible]
whether or [illegible]tions,
and similar considerations are believed
to have been the prime factors in the executives’
ratings.

Attitude Toward the Supervisor

The Questionnaire. Nearly all employees
of the plant filled out an
attitude questionnaire. The questionnaire included
21 items about the individual employee’s
immediate supervisor so as to provide,
to some extent, a diagnostic view of the
supervisor as well as a total score representing
the employee’s opinion of the supervisor.
Item Selection Procedure. Using a primary
group composed of 50% of the participating
employees, the 21 questions were scored by
giving a weight of 1 to the most favorable
response and 0 to the other one, two or three
responses. Total scores were computed for
each employee. On the basis of total score
the primary group was divided into a high-scoring
half and a low-scoring half. Then the
per cent of the high-scoring half and the per
cent of the low-scoring half giving the
most favorable response to each of the 21
items were computed. The significance of the
difference between these two per cents was
computed for each item by means of the
Lawshe-Baker nomograph (5). Two items
were discarded by this process. All remaining
items on the survey were also processed in the
same manner. Three new items were added to the
19, making a total of 22 items measuring employee
attitude toward the supervisor.
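The item analysis described above can be sketched in code. The paper evaluated each difference with the Lawshe-Baker nomograph; the sketch below substitutes an ordinary pooled two-proportion z statistic, which is a different technique used here only for illustration. The function names and the small response matrix are hypothetical.

```python
import math


def split_halves(total_scores):
    """Rank employees by total score and return (high_half, low_half) index lists."""
    order = sorted(range(len(total_scores)),
                   key=lambda i: total_scores[i], reverse=True)
    half = len(order) // 2
    return order[:half], order[half:]


def item_discrimination(item_responses, high, low):
    """Return (p_high, p_low, z) for one item scored 1 = most favorable, 0 = other."""
    fav_high = sum(item_responses[i] for i in high)
    fav_low = sum(item_responses[i] for i in low)
    p_high, p_low = fav_high / len(high), fav_low / len(low)
    pooled = (fav_high + fav_low) / (len(high) + len(low))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(high) + 1 / len(low)))
    z = (p_high - p_low) / se if se > 0 else 0.0
    return p_high, p_low, z


# Hypothetical data: 8 employees (rows) by 3 items (columns).
responses = [
    [1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 1],
    [0, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 0],
]
totals = [sum(row) for row in responses]
high, low = split_halves(totals)
for item in range(3):
    column = [row[item] for row in responses]
    print(item, item_discrimination(column, high, low))
```

Items whose high-half and low-half percentages do not differ reliably would be candidates for discarding, as the two rejected items were in the study.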
Scale Reliability. Each questionnaire in
the holdout group was scored on the 22 items.
Separate total scores for each employee were
computed for the 11 odd numbered items and
for the 11 even numbered items, and these
were correlated. The resulting coefficient of
.865, when stepped up for 22 items, yielded a
scale reliability of .92.

Individual employee scores on this scale
ranged from 0 to 22. This score may be
easily interpreted as the number of questions
which the employee answered in the most
favorable manner. Average scores for each
work group were computed
and ranged from 8.8 to 19.3. The mean
score of the 14 work groups toward the
supervisor was 13.9 and the standard deviation
3.2.

The Attitude Dimension. The content of
the 22 items is important to an understanding
of what was being measured by the scale.
The questions covered many aspects of the
supervisor’s behavior as perceived by the employees,
including such things as, does he:
give you straight answers, avoid you when he
knows you want to see him about a problem,
criticize you for happenings over which you
have no control, delay in taking care of your
complaints, keep you informed, give you
recognition, show interest in your ideas,
follow through on his promises, explain to
you the “why” of an error to prevent recurrence,
give you sufficient explanation of why
a work change is necessary, etc.


Results 


Correlation. The average rating of each
work group on how well it was doing its job
was correlated with the average attitude score
in the work group toward the supervisor.
The Pearson coefficient was .86. With 12
degrees of freedom, a correlation of .661 is
significantly different from zero at the 1%
level of confidence. Figure 1 provides a visual
indication of the relationship between the
two variables as well as an indication of the
general dispersion of each variable.
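The critical value of .661 quoted above can be recovered from a tabled t value, since r = t / sqrt(t² + df). The sketch below assumes the two-tailed 1% critical t of 3.055 for 12 degrees of freedom from a standard t table; `critical_r` is an illustrative name.

```python
import math


def critical_r(t_crit, df):
    """Smallest |r| that reaches significance, given the critical t for df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


T_CRIT_01_DF12 = 3.055  # tabled two-tailed 1% critical t for 12 df (assumption)
r_needed = critical_r(T_CRIT_01_DF12, 12)
print(round(r_needed, 3))   # 0.661
print(0.86 > r_needed)      # True: the obtained coefficient exceeds it
```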

Interpretation. In the interpretation of
this very high relationship between rated
productivity and employee attitude toward
supervisor, caution must be exercised. It is
all too easy to fall into the error of cause and
effect thinking. On the basis of this study it
can be concluded only that the behavior of
the supervisor, as perceived by the employees,
is highly related to the productivity of the
group as perceived by higher management.

Fig. 1. Scatter diagram showing relationship between
productivity of departments as rated by
executives and average employee attitude toward
supervisor. The correlation between the variables
is .86. (Vertical axis: rated productivity of department;
horizontal axis: average employee attitude toward supervisor.)

The literature has long been replete with 
statements as to the influence of the super- 
visor on group output. French says, “Lead- 
ership has long been regarded as the most 
important factor in group effectiveness . . .” 
(1, p. 475). He points out that, “Since the 
manipulation of (or allowance for) variables 
related to morale is in institutional groups 
primarily the responsibility of appointed 
leaders, the factor of leadership assumes cen- 
tral significance” (1, p. 485). If the basic 
assumption that the attitudes of people exert 
a great influence over their performance is 
true, then it follows that the leader has within 
his power a means by which he can influence 
the output of the group. 

Results similar to those reported here have 
been obtained in the Prudential study. 
Katz (3) lists a number of variables in super- 
visory behavior which were related to the 
productivity of the work group. It was found 
that supervisors of high productivity groups 
placed less direct emphasis on production as 
the goal, encouraged worker participation in 
making decisions, were more employee 
centered, and spent more time in supervision 
and less in production work. The only employee
attitude in the study positively related
to productivity was pride in the work group. 
However, employee attitude toward super- 
visor was not mentioned in the report. In 
view of the types of supervisory behavior 
which were found to be related to produc- 
tivity, it might be inferred, however, that 
employee attitude toward this supervisory 
behavior would also have been so related. 


Summary 


A measure of relative productivity of work 
groups in doing their jobs was obtained by 
having six executives rate the work groups 
by the paired comparison system. An atti- 
tude questionnaire was administered to the 
employees of these work groups. From this 
questionnaire 22 items were used to measure 
employee attitude toward the immediate 
supervisor of the work group. The correlation
between the executives’ rating of the
productivity of the work groups and the
employees’ attitude toward the supervisor was
.86. This relationship substantiates the
hypothesis that the supervisor’s behavior, as 
perceived by the employees, is highly related 
to the output of the work group. 


Received April 14, 1953. 
Early publication. 


References 


1. French, R. L. Morale and leadership. In National
Research Council, Committee on Undersea
Warfare, Panel on Psychology and Physiology,
A survey report on human factors in
undersea warfare. Washington: National Research
Council, 1949. Pp. 463-490.

2. Giese, W. J., and Ruter, H. W. An objective
analysis of morale. J. appl. Psychol., 1949,
33, 421-427.

3. Katz, D. Morale and motivation in industry.
In W. Dennis (Ed.), Current trends in industrial
psychology. Pittsburgh, Pa.: University
of Pittsburgh Press, 1949.

4. Katz, D., Maccoby, M., and Morse, Nancy C.
Productivity, supervision and morale in an
office situation. Part I. Ann Arbor, Mich.:
Institute for Social Research, University of
Michigan, 1950.

5. Lawshe, C. H., and Baker, P. C. Three aids in
the evaluation of the significance of the difference
between percentages. Educ. psychol.
Measmt., 1950, 10, 263-270.

6. Lawshe, C. H., Kephart, N. C., and McCormick,
E. J. The paired comparison technique for
rating performance of industrial employees.
J. appl. Psychol., 1949, 33, 69-77.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


A Simplified Procedure for the Measurement of Employee 
Attitudes 


Melany E. Baehr 


Industrial Relations Center, University of Chicago 


During the past several years, the Industrial
Relations Center of the University of
Chicago has been engaged in the development
of the SRA Employee Inventory, an instrument
for assessing employee attitudes.¹ The
inventory was developed by a coordinated research
team representing the fields of psychology,
sociology, business, and economics.
It was the intent of the research team to construct
an inventory which would yield a profile
of scores to reflect the attitude of any
given group of employees toward the significant
factors in the work situation. In addition,
the inventory was to be so constructed
that its administration and scoring and the
interpretation of results could be accomplished
with the minimum expenditure of
time and effort.

During the developmental stage of the inventory,
the writer investigated experimentally
several problems in test construction.
This investigation dealt with the way in
which the profile of scores was affected by:
(1) the arrangement of the items (randomized
vs. categorized items); (2) the number
of scale intervals (five-point vs. three-point
scales); and (3) the scoring procedure (unweighted
vs. weighted responses).

The effect on the profile of scores was investigated
separately for each of these
conditions. In addition, a comparison was made
of the profiles of scores resulting from six
possible combinations of item arrangement,
number of scale intervals, and scoring procedure.
These six combinations represent
procedures of an increasing degree of complexity
in the treatment of the employee responses.
The object was to identify the simplest
procedure which could be used without
loss of information.

¹ Published by Science Research Associates, Inc.,
Chicago.


The Problem 


Randomized vs. Categorized Items. In an 
inventory or test in which groups of items are 
combined to yield sub-test (category) scores, 
the items may be presented either in random 
order or grouped under the category headings 
to which they belong. The question arises 
as to whether or not the grouping of items 
will affect the profiles of category scores. In 
other words, do the test items yield different 
profiles of category scores when they are 
grouped together or categorized in the in- 
ventory than when they are randomized 
throughout the inventory? 

Five-Point vs. Three-Point Scale. The 
number of intervals which can be used effec- 
tively in any inventory or test is a function 
of such conditions as the degree to which the 
attribute to be rated can be objectively de- 
fined, the degree of skill possessed by the 
raters in the use of rating scales, and their 
interest in making the ratings. The question 
arises as to whether or not the use of a five- 
point scale would yield a profile of category 
scores which was different from that obtained 
with the three-point scale. 

Unweighted vs. Weighted Responses. When 
the five-point scale is used, the further ques- 
tion arises as to whether or not the profile of 
category scores will be affected if the re- 
sponses in the extreme scale intervals are 
given twice the weight of the responses in the 
scale intervals immediately preceding them. 

The General Problem. Six procedures for
the treatment of employee responses result
from the combination of the three conditions
discussed above. These are as follows: (1)
a three-point scale with categorized items;
(2) a three-point scale with randomized
items; (3) an unweighted five-point scale
with categorized items; (4) an unweighted
five-point scale with randomized items; (5)
a weighted five-point scale with categorized
items; and (6) a weighted five-point scale
with randomized items.

The hand scoring of an inventory utilizing 
a three-point scale and categorized items need 
involve only a count of the number of ac- 
ceptable responses in groups of consecutive 
items. Under these conditions the profile of 
category scores is immediately available. It 
is self-evident that an inventory composed of 
randomized items or one in which responses 
to items must be made in terms of a scale 
having a larger number of intervals, espe- 
cially if the extreme intervals are weighted, 
would require a proportionately greater 
amount of administration and scoring time. 
The general problem, therefore, is to deter- 
mine whether or not the simplest procedure 
(use of the three-point scale with categorized 
items) yields a profile of category scores 
which is different from the profile yielded by 
the more complicated procedures.


The Experimental Design 


A total of 64 items, consisting of state- 
ments descriptive of the work situation, were 
selected for inclusion in a preliminary form 
of the SRA Employee Inventory. These were 
grouped under the following general categories:
I. Job Demands; II. Working Conditions; III. Pay;
IV. Company Benefits; V. Changes on the Job;
VI. Friendliness of Fellow Employees;
VII. Supervisory Effectiveness; VIII. Management
and Company Policy; IX. Communication; and
X. Personal Satisfaction on the Job.

Four forms of the inventory were constructed
as follows:

1. Randomized items to which responses
were to be made on a three-point scale.

2. Randomized items to which responses
were to be made on a five-point scale.

3. Categorized items to which responses
were to be made on a three-point scale.

4. Categorized items to which responses
were to be made on a five-point scale.

Each of the four forms of the inventory
was administered to a separate group of
employees at a retail store of a large
merchandizing organization in Chicago. These groups


of employees were approximately equal and 
were randomly selected from a total experi-
mental population of 454 subjects.

For the two forms in which the three-point
scale was used, the employee was required to
indicate whether he agreed, was undecided,
or disagreed with each statement (i.e., inventory
item). About half the items in each
category were company oriented, and half
anti-company oriented. If an employee
agreed with a company-oriented item, e.g.,
“Management here is really interested in the
welfare of employees,” it was regarded as a
“Favorable” response (i.e., favorably oriented
toward the company). If he disagreed with
such an item, it was regarded as an “Unfavorable”
response. The converse held with
respect to the items that were anti-company
oriented.

Essentially the same procedure was followed
also for the two forms in which the
five-point scale was used. However, since
the five-point scale provided the employee
with the opportunity to indicate, if he so
chose, that he strongly agreed or strongly
disagreed with an item, there were two additional
types of response. These were regarded
as indicating a “Very Favorable” or
a “Very Unfavorable” orientation toward the
company.

Results 


The inventory yields a profile of ten category
scores. Each category score is the per
cent of favorable responses made by the group
to the items in the category. It is regarded
as a measure of the positive feeling held toward
the company by the group. The per
cent of favorable responses rather than the
number of favorable responses was used because
the number of items varies from category
to category. The specific formulae employed
in the calculation of the per cent
favorable response (P.F.R.) are given below:


Three-Point Scale (Categorized or Randomized Items):

$$\text{P.F.R.} = \frac{100F}{NI}$$

Unweighted Five-Point Scale (Categorized or Randomized Items):

$$\text{P.F.R.} = \frac{100(F + VF)}{NI}$$

Weighted Five-Point Scale (Categorized or Randomized Items):

$$\text{P.F.R.} = \frac{100(F + 2VF)}{2NI}$$


P.F.R. is the per cent favorable response, 

F is the number of “Favorable” responses, 

VF is the number of “Very Favorable” re- 
sponses, 

N is the number of persons in the group, 
and 

I is the number of items in the category. 
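The three scoring rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `percent_favorable` and the response counts are hypothetical, and the weighted denominator of 2NI (the largest weighted total possible) is an inference from the remark that the weighted profiles fall at a lower level, not a figure confirmed by the legible text.

```python
def percent_favorable(f, n_persons, n_items, vf=0, weighted=False):
    """Per cent favorable response (P.F.R.) for one category.

    f         -- number of "Favorable" responses
    vf        -- number of "Very Favorable" responses (zero on a three-point scale)
    n_persons -- N, persons in the group
    n_items   -- I, items in the category
    weighted  -- if True, count each "Very Favorable" response twice
    """
    possible = n_persons * n_items
    if possible <= 0:
        raise ValueError("N and I must both be positive")
    if weighted:
        # Assumed denominator: 2NI, the maximum weighted total.
        return 100.0 * (f + 2 * vf) / (2 * possible)
    return 100.0 * (f + vf) / possible


# Hypothetical counts: 100 employees, 6 items, 300 "Favorable"
# and 120 "Very Favorable" responses.
unweighted = percent_favorable(300, 100, 6, vf=120)
weighted = percent_favorable(300, 100, 6, vf=120, weighted=True)
```

With these made-up counts the unweighted score is 70.0 and the weighted score is 45.0, which matches the direction of the level difference described for the weighted profiles.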


The profiles of the per cent favorable response
which result from the six procedures in
the present investigation are shown in Figure
1, where the following identifying symbols
have been used:

A. Unweighted five-point scale with randomized items.

A’. Unweighted five-point scale with categorized items.

B. Weighted five-point scale with randomized items.

B’. Weighted five-point scale with categorized items.

C. Three-point scale with randomized items.

C’. Three-point scale with categorized items.


It can be seen by inspection of Figure 1 
that all the profiles exhibit a great similar- 
ity in shape, though the profiles in which 
weighted responses were used occur at a lower 
level on the scale. 

A quantitative measure of the similarity 
between the profiles was obtained by calcu- 
lating the product-moment correlation coeffi- 
cients between the sets of category scores. A 
comparison of the profiles obtained from ran- 
domized and from categorized items, when 
the possible effects of the number of scale 
intervals and the method of scoring are con- 
stant, is obtained by comparing profile A with 
A’, B with B', and C with C’. A comparison 
of the profiles obtained from the three-point 
and the five-point scales, when the possible 
effects of the order of appearance of the items 
are constant, is obtained by comparing profile 
A with C, A’ with C’, B with C, and B’ with 
C'. A comparison of the profiles obtained 
from the five-point weighted scale and the 
five-point unweighted scale, when the possible 
effects of the order of appearance of the items 
are constant, is obtained by comparing profile 
A with B, and A’ with B’. For the sake of 
completeness, the other possible profile com- 
parisons were also made, i.e., A with B’, A’ 
with B, A with C’, A’ with C, B with C’, and
B’ with C. The results for these four sets of
comparisons are given in Table 1.
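The comparison scheme described above can be sketched as follows. The helper name `pearson_r` and the two ten-score profiles are hypothetical; only the six profile labels come from the text. The sketch shows two things: six profiles give 15 distinct pairs, and the product-moment coefficient reflects similarity of shape only, so a profile at a lower level can still correlate perfectly with another.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 long")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


# The six profile labels used in the text.
PROFILES = ["A", "A'", "B", "B'", "C", "C'"]
ALL_PAIRS = list(combinations(PROFILES, 2))  # every possible comparison

# Hypothetical category scores (ten categories), not data from the study.
profile_x = [62.0, 55.0, 40.0, 71.0, 48.0, 80.0, 66.0, 58.0, 52.0, 69.0]
# Same shape at a lower level, as a weighted profile might be.
profile_y = [score / 2 + 3.0 for score in profile_x]

r_xy = pearson_r(profile_x, profile_y)
```

Because the second hypothetical profile is a linear rescaling of the first, `r_xy` equals 1 up to rounding, even though every score is lower.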
Table 1 


Product-Moment Correlation Coefficients Between the Ten Category Scores in the Six Profiles
Obtained from the Different Procedures in the Treatment of Employee Responses


Randomized vs. Categorized Items
Profiles Compared: AA’, BB’, CC’
r: 86 97 97

Five-Point vs. Three-Point Scale
Profiles Compared: AC, A’C’, BC, B’C’
r: 98 98

Unweighted vs. Weighted Responses
Profiles Compared: AB, A’B’
r: 5 05

Other Possible Comparisons
Profiles Compared: AB’, A’B, AC’, A’C, BC’, B’C
r: 95 97 94 08


Fig. 1. Profiles showing the per cent favorable response to the ten categories in the
Employee Inventory. There is one profile for each of the six procedures employed in the
treatment of the employee responses. [Panels: A. Five-Point Scale, Randomized Items
(Unweighted); A’. Five-Point Scale, Categorized Items (Unweighted); B. Five-Point Scale,
Randomized Items (Weighted); B’. Five-Point Scale, Categorized Items (Weighted);
C. Three-Point Scale, Randomized Items; C’. Three-Point Scale, Categorized Items.
Horizontal axis: Per Cent Favorable Response, 0 to 90.]

The high correlations indicate that the
profiles are highly similar in shape. The category
scores contributing to the profiles A,
A’, B, B’, C, and C’ were also compared with
respect to their variances.
Bartlett’s test of homogeneity of variances *
gave a chi-square of 1.291 with 5 degrees of
freedom, yielding a P value of .94.

* Snedecor, George W. Statistical methods. Ames,
Iowa: The Iowa State College Press, 1946, p.

The results indicate, therefore, that the six
profiles are highly similar with respect to
their shape and the variability of their
category scores. The profiles obtained from the
weighted five-point scale are at a different 
level, but can readily be converted to the 
same level as the remaining profiles by applying
a constant stretching factor to the category
scores.

The high correlations between the profiles 
indicate that only minor variations occur in 
the individual category scores. Such minor 
variations would not affect the interpretation
of the profile as a whole. 
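The P value reported for Bartlett's test can be checked from the chi-square of 1.291 with 5 degrees of freedom. The sketch below uses the standard closed-form series for the upper tail of the chi-square distribution; the function name `chi2_upper_tail` is hypothetical, and this is offered as a check on the arithmetic, not as the computation the author used.

```python
from math import erfc, exp, factorial, pi, sqrt


def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), df a positive integer."""
    if x < 0 or df < 1:
        raise ValueError("x must be non-negative and df a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite Poisson-type sum.
        return exp(-half) * sum(half ** r / factorial(r) for r in range(df // 2))
    # Odd df: normal tail plus a finite series in chi = sqrt(x).
    chi = sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return erfc(chi / sqrt(2.0)) + sqrt(2.0 / pi) * exp(-half) * total


p_value = chi2_upper_tail(1.291, 5)
```

The result rounds to .94, in agreement with the reported P value; a probability this high gives no grounds for rejecting homogeneity of the six variances.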


Summary 


A comparison was made of the profiles of
category scores obtained from six different
procedures in the treatment of the responses
to items in an inventory designed to reflect
the attitude of industrial employees toward
the significant aspects of the work situation.

These six procedures represented progressively
increasing complexity in the arrangement
of the items, the rating scales employed,
and in the method of scoring the responses.

Comparison of the relevant profiles showed
that:

(1) Almost identical profiles were obtained
from randomized and from categorized
items,

(2) Almost identical profiles were obtained
from five-point and from three-point
scales, and

(3) Almost identical profiles were obtained
from unweighted and from weighted
responses.


Finally, all possible comparisons were made 
between the six profiles studied in this investi- 
gation. The 15 product-moment correlation 
coefficients which resulted ranged from + .94 
to + .98 (N = 10). This indicated that the 
profiles are highly similar in shape. Applica- 
tion of Bartlett’s test of homogeneity of vari- 
ances indicated that the profiles were similar 
with respect to the variability of their cate- 
gory scores. It can be concluded, therefore, 
that the use of the simplest procedure, i.e., 
the three-point scale with categorized items, 
results in a profile of scores which would be 
interpreted in exactly the same way as those 
resulting from the other five, more compli-
cated procedures.

It is clear that the use of the simplified 
procedure will result in considerable savings 
in time, labor, and costs involved in the ad-
ministration and scoring of the inventory.
From a practical standpoint, therefore, this 
investigation points up the desirability of 
running pilot studies to determine whether 
or not a simpler form of a test or inventory 
will yield any less information than more 
complicated ones for specific subject popu- 
lations. This is especially true when, as is 
often the case in industrial and educational 
institutions, an inventory which is once ac- 
cepted is likely to be routinely administered 
to thousands of cases year by year. 


Received July 14, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


The Motivation Factor in Testing Supervisors 


Eugene Emerson Jennings 


Wharton School of Finance and Commerce, University of Pennsylvania


Effectively using psychological testing to 
aid in selecting supervisory personnel presents 
an extremely important problem in motiva- 
tion. The question is whether there are dif- 
ferences in motivation in taking tests for re- 
search or for actual promotion purposes. If 
there are motivational differences between 
taking tests for research and for keeps, which 
basis of motivation will elicit test responses 
that more clearly reflect the individual’s actual
aptitude? 

Method 


The writer had an opportunity to check 
this with a sample of 40 supervisors who vol- 
unteered initially to participate in a testing 
program aimed at obtaining for research purposes
a measure of the qualities and characteristics
identifying the group as a whole.
The supervisors were randomly divided into
two groups of 20 each. Rough comparability
was obtained in age, education and experience 
since differences between these means and 
sigmas did not exceed the .05 level of significance.
The two groups, identified as 1
and 2, were given the Wonderlic Personnel
Test Form A, [...] to determine which basis of
motivation elicited test scores [...]
The same procedure was [...] supervisors in
Group 2. [...] each supervisor in both groups
three months later showed the criterion to
have an estimated +.89 reliability. Correlations
between test scores and criterion for Groups
1 and 2 for both testing situations were obtained
by the rank-differences method.
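For readers unfamiliar with the rank-differences method, a minimal sketch follows. It ranks each set of scores (giving tied scores their average rank) and applies the usual formula rho = 1 - 6Σd²/(n(n² - 1)). The function names and the five pairs of scores are hypothetical, and the formula is exact only when there are no ties; how ties were handled in the study is not stated.

```python
def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_difference_rho(x, y):
    """Rank-differences (Spearman) correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 long")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


# Hypothetical test scores and criterion ratings for five supervisors.
test_scores = [18, 25, 21, 30, 15]
ratings = [2, 4, 5, 3, 1]
rho = rank_difference_rho(test_scores, ratings)
```

For these made-up values the squared rank differences sum to 8, so rho is 1 - 48/120 = 0.6.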


Results 


Table 1 shows the mean scores and sigmas
for Group 1 and 2 with respect to Form A
and B of the Wonderlic Personnel Test.

Whereas the differences in means and sigmas
were not significant between the first and
second testing for the Control Group 1, the
Experimental Group 2, believing their performance
at the second testing would affect
their opportunity for promotion, increased
their mean score almost seven points.


Table 1 


Scores of the Wonderlic Tests 


Group 1 Group 2 q* 
(N =20) (N= 20) 

Form A 78 
Means 19, 19.9 “44 
Sigmas 5.5 5.0 ý 

Form B 

3 
Means 20.0 26.6 6.031 
i 7 63 
Sigmas 5.7 6.4 
* Differences computed before rounding to one decimal place.

† Indicates significant difference beyond .05 level of confidence.


However, did supervisors in both Groups
1 and 2 maintain comparable scores in the
two testing situations? The correlations by
the rank-differences method between first and
second testings were +.76 and +.39, respectively,
for Groups 1 and 2. The former
but not the latter is significantly greater than
zero since it exceeds the .05 level of confidence.

Generally, supervisors in Group 1 maintained
comparable absolute and relative
scores in both testing situations. Supervisors
in Group 2 did not maintain absolute and
relative scores when advised that promotions
would be based on test performance. Inspection
revealed that several supervisors changed
rank-positions from highest to lowest and in
two cases rank values changed while numerical
scores did not.

The correlations between test scores and
criterion for Groups 1 and 2 were, respectively,
+.41 and +.34 for the first testing
and +.37 and +.67 for the second testing.
Only the last correlation is significantly
greater than zero since it exceeds the .05 level
of confidence.
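The significance statements for these correlations can be checked roughly with the familiar t approximation, t = r√((n - 2)/(1 - r²)), with n = 20 supervisors per group and 18 degrees of freedom. This is a sketch under assumptions: the article does not say which significance test was used, the t approximation is only approximate for rank correlations, and the names below are hypothetical. The two-tailed .05 critical value of t for 18 degrees of freedom is 2.101.

```python
from math import sqrt

CRITICAL_T_18_DF = 2.101  # two-tailed .05 point of t with 18 degrees of freedom


def t_for_correlation(r, n):
    """t statistic for testing a correlation against zero, n - 2 df."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * sqrt((n - 2) / (1.0 - r * r))


# Coefficients as reported in the text, each based on 20 supervisors.
reported = {
    "retest, Group 1": 0.76,
    "retest, Group 2": 0.39,
    "criterion, first testing, Group 1": 0.41,
    "criterion, first testing, Group 2": 0.34,
    "criterion, second testing, Group 1": 0.37,
    "criterion, second testing, Group 2": 0.67,
}

significant = [
    label
    for label, r in reported.items()
    if t_for_correlation(r, 20) > CRITICAL_T_18_DF
]
```

Under this approximation only the +.76 retest correlation and the +.67 criterion correlation exceed the critical value, which agrees with the significance statements in the text.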

These data tend to indicate that an insignificant
relationship existed between test
scores and criterion of over-all performance
when the tests were administered for purely
research purposes. However, changing the
basis of motivation from that of research to
that of promotion purposes brought about
a highly significant relationship between test
scores and criterion.

It might be interesting to mention that
two men from Group 2 were actually promoted
since the several supervisors up for
consideration were just by chance in Group 2.
However, their test scores were not helpful
in deciding which of the several to promote
since all of their scores on the second test
were fairly high. But had scores on the first
test, given for purely research purposes, been
used to aid management in promoting two
supervisors, it is doubtful that the two actually
selected would have been since they
had two of the lowest scores in their group.


Summary 


The problem of whether there are differ- 
ences in motivation in taking tests for re- 
search or for promotion purposes was studied 
by giving to a group of supervisors two forms 
of the Wonderlic Personnel Test with a time 
interval between for research purposes. A 
second group took the same two forms but 
the second administration was with reference 
to possible promotion. The following results 
were obtained: 

1. The promotion motivation produced sig- 
nificant increases in the mean score whereas 
the control group showed no such increases. 

2. The promotion motivation changed the 
individual’s relative standing in the experi- 
mental group as shown by the lower correla- 
tions between the two tests than occurred in 
the control group. 

3. Scores motivated by promotion purposes 
had greater validity as indicated by correla- 
tions with a criterion based on ratings of 
over-all performance. 

Although it is very difficult to draw general 
conclusions, the implications of this study 
should serve to sound a note of caution to 
others doing research on aptitude tests in 
industry to take special pains to control the 
factor of motivation. 


Received February 18, 1953. 
Early publication. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 3, 1953 


The Minnesota Engineering Analogies Test * 


Marvin D. Dunnette 


Industrial Relations Center, University of Minnesota 


Engineers and technically trained person- 
nel are key figures in meeting unprecedented 
demands of our armed forces, defense indus- 
try, and our civilian economy. As a result, 
the country faces a critical shortage of engi- 
neering personnel. In the year 1949-1950, 
a total of 57,159 (20) engineers were gradu- 
ated from the nation’s technical schools. 
These graduates were rapidly absorbed by 
industry. Against an estimated annual re- 
quirement for 30,000 new engineers, the 
yearly crop of engineering graduates, how- 
ever, is rapidly declining. Thus from a total 
of 38,000 graduated in 1951, the estimated 
number of graduates falls to 17,000 for 
1954 (19). 

In view of these figures, the selection of 
engineering students to continue in pursuit 
of graduate degrees becomes a problem of 
primary importance. It is undesirable that
technical manpower be wasted in the unsuc- 
cessful pursuit of advanced training. In like 
manner, it is important that educators be 
able to identify the most able students in 
order that they may be urged to pursue grad- 
uate work. It has long been recognized by 
engineering faculties that wise selection of 
advanced students would be facilitated by 
development of a short, easily administered 
test with demonstrated validity for the as- 
sessment of potentialities necessary to suc- 
cess in graduate school. This article presents 
the rationale for the use of a special analogies 
test in this assessment task and constitutes a 


description of an exploratory attempt to build 
such an instrument. 


Only within recent years have systematic
efforts been made to predict success in engineering
curricula. Usually such investigations
have not met with the degree of success
enjoyed by projects designed to develop predictive
devices in other fields of academic endeavor.
However, studies (1, 3, 4, 9, 11, 16,
17) concerned with the prediction of success
in undergraduate engineering have uniformly
shown certain measures to be of maximum
utility. It would appear that an ideal combinational
measure for the evaluation of a
person’s aptitude for engineering would include
measures of previous academic achievement
(1, 4), general intelligence (16), and
facility in mathematics (1, 3, 4, 9, 11, 17).

The problem with which this investigation
was most concerned (i.e., the evaluation of
graduating engineers) has been little recognized
in the literature. The Graduate Record
Examination has been used extensively
in several fields, but little information relating
performance on the G.R.E. to achievement
in advanced engineering training is
available. Learned (5, 6) in discussing the
relative merits of the G.R.E. states that the
theoretical approach of the G.R.E. (i.e., the
testing of information gleaned from a variety
of subject matter fields) has proved successful
in prediction at the higher levels of academic
endeavor. Speer (13), on the other
hand, feels that the broad generality of the
subject matter tested by the G.R.E. is the
very factor which makes it unsuitable for use
with engineers. He emphasizes that selection
of capable engineering graduate students
must include a measure of general
ability as well as measures of achievement
in previous work.

In general, little of a definitive nature has
been done in the evaluation of graduating
engineers. There is an indication that success
in postgraduate employment is related
to undergraduate grades (10). It is felt that
proficiency in graduate school can be predicted
best by a test combining a measure of
general intelligence with some measure of
previous achievement (13). One relevant
study (15) has shown that tests requiring
the ability to perform abstract reasoning are
efficient predictors of success in advanced
study in the physical sciences.

Experience with the verbal analogy as a 
test item has shown it to have characteristics 
which would appear to make it an efficient 
device for use in evaluating engineering grad- 
uates. The verbal analogy item is short. A 
test including many such items may thus be 
easily administered in a brief period of time. 
Because the analogy requires the perception 
of relations and the generation of correlate 
relations, it is a measure of abstract intellect
(12). Furthermore, although it is related to
verbal facility, it has also been shown to be
associated (r = .67, .68) with measures of
arithmetic reasoning and arithmetic computation
(18). Factor analyses of verbal analogies
tests (14) have indicated high loadings
in V (verbal) and D (deductive) factors.
The latter factor is most prevalent in tests
calling for arithmetic reasoning and number
series completion, abilities which are important
in predicting success in engineering training.

In the construction of analogy items, it is
possible to include concepts calling for knowledge
in specific subject matter fields. Thus
analogies tests may be used to measure previous
achievement. Levine’s study (7, 8)
bears particularly on this point. He developed
an analogies test specific to the subject
matter of psychology. He found his test to
be a slightly better predictor of achievement
in psychology courses than a test of general
ability such as the Miller Analogies. He concludes,
“At any rate the data obtained in this
project would tend to indicate the [...]
exploring the possible uses of special analogies
tests in other fields” (8, p. 305).

Thus, a special analogies test involving engineering
knowledge and concepts may be an
efficient instrument in measuring capabilities
necessary to success in graduate engineering.
Because of this, such a test was constructed
and used in this exploratory evaluation of
engineering graduates. 1

1 It may turn out that this type of test would be
even more important in the selection and placement
of engineers (sales, research, design, electrical, etc.)
in business and industry. For this reason the test
here reported has been extended and is now being
validated in established engineering research
departments.
The Present Study 


The purpose of this study was to build a 
test applicable to all fields of engineering. 
This decision necessitated drawing items from 
that store of information which can accurately 
be said to comprise the “common-core” of 
academic knowledge among graduating engi- 
neers. An analysis of the curricula in 14 
engineering colleges indicated the so-called 
“common-core” to consist of courses in inorganic
chemistry, analytic geometry, trigonometry,
algebra, differential and integral
calculus, physics, hydraulics, statics and dy- 
namics, strength of materials, thermodynam- 
ics, and a survey of the basic principles of 
electrical engineering. Items were written for 
each of these subject matter fields. Minute 
details were avoided; only important princi- 
ples basic to the fields were included. By so 
doing, it was hoped that esoteric informa- 
tional content would be ruled out as a deter- 
miner of item difficulty. From the initial 
pool of 135 items, 90 were selected for pre- 
liminary administration. The following are 
examples of the analogies 2 which were used:


Consider a triode:

Spectators : turnstile :: plate current :

(1.) cathode; (2.) plate; (3.) anode;
(4.) grid.

Pauper : money :: riveted butt joint :

(1.) bearing stress; (2.) bending stress;
(3.) tensile stress; (4.) shearing
stress.

Diameter : circumference :: y = bx :

(1.) $x^2 + y^2 = r^2$; (2.) $x^2 = py$;
(3.) ...; (4.) ...

The 90-item test was administered to 203 
engineering seniors enrolled in G.E. 103, a
survey course of engineering ethics, which is 
required of all graduating seniors in the In- 
stitute of Technology at the University of 
Minnesota. 

Of the 203 seniors who took the prelim- 
inary form of Minnesota Engineering Analo- 
gies Test, only 91 completed every item on 

2 The correct response for the first example is (4.)
grid; for the second is (2.) bending stress; and
the third is (1.) $x^2 + y^2 = r^2$.


the test. This was because of the limited 
time afforded (50 minutes) by the length of 
the class period. Each student’s score was, 
therefore, expressed in terms of an accuracy 
index derived by dividing the number of items 
answered correctly by the number answered 
incorrectly. A scatter diagram portraying the 
relation between the number of items at- 
tempted and the accuracy index indicated 
that slow workers worked just as accurately 
as more rapid workers. Thus it was indi- 
cated that ability to finish the test was not 
related to a student’s proficiency in the test. 
Therefore, it did not seem desirable to include 
the speed factor in the analysis of results. 
For item analysis purposes then, the scores 
were expressed in terms of the accuracy 
index. 
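The accuracy index described above is simply a ratio of right to wrong answers among the items a student attempted. A minimal sketch follows; the function name and the two students' counts are hypothetical, and returning infinity for a paper with no wrong answers is an assumption, since the text does not say how such a case was scored.

```python
def accuracy_index(n_right, n_wrong):
    """Number of items answered correctly divided by the number answered incorrectly."""
    if n_right < 0 or n_wrong < 0:
        raise ValueError("counts must be non-negative")
    if n_wrong == 0:
        # Assumed handling: the source does not cover a paper with no errors.
        return float("inf")
    return n_right / n_wrong


# Two hypothetical students: a slow worker who attempted 60 items
# and a fast worker who attempted all 90.
slow_worker = accuracy_index(40, 20)
fast_worker = accuracy_index(60, 30)
```

Both hypothetical students obtain an index of 2.0, which illustrates why the index does not penalize a student for failing to finish within the class period.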

Davis’ item analysis chart (2) was used 
to compute biserial validity coefficients. Two 
indexes were computed for each item. The 
first validity index was based on an “internal” 
criterion, the accuracy index. The second 
validity index was based on an “external” cri- 
terion consisting of the over-all honor point 
average 3 earned at the University of Minnesota.
A total of 63 items exhibiting validity
coefficients above .10 on both criteria were 
combined into a final form of the test. This 


final form was then administered to 53 grad- 
uate students in engineering. 
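The selection rule just described (keep an item only if its validity coefficient exceeds .10 on both criteria) can be sketched as below. The function name and the six pairs of coefficients are hypothetical and serve only to show the rule; they are not values from the item analysis.

```python
def select_items(internal, external, threshold=0.10):
    """Return 1-based numbers of items whose coefficients exceed the threshold on both criteria."""
    if len(internal) != len(external):
        raise ValueError("need one coefficient per item on each criterion")
    return [
        number
        for number, (r_internal, r_external) in enumerate(zip(internal, external), start=1)
        if r_internal > threshold and r_external > threshold
    ]


# Hypothetical biserial coefficients for six items.
internal_validity = [0.42, 0.08, 0.25, 0.31, -0.05, 0.15]  # accuracy-index criterion
external_validity = [0.22, 0.30, 0.05, 0.12, 0.18, 0.11]   # honor-point-average criterion

kept = select_items(internal_validity, external_validity)
```

Here items 1, 4, and 6 are kept; items 2 and 5 fail the internal criterion and item 3 fails the external one.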


Results 


Among the 203 seniors, the correlation be- 
tween honor point average and the accuracy 
index for the 90-item test was .57. A corre- 
lation of this magnitude was considered en- 
couraging in view of the fact the test still 
contained many poorly discriminating items. 

Table 1 shows distributions of the validity 
coefficients obtained in the item analysis. 

The corrected odd-even reliability of the
63-item test administered to the graduate students
was .86. A reliability of this [...]
standardized tests. [...] responses made by the [...]

3 The honor point average is calculated on the basis
of three honor points for each credit of A, two
for B, one for C and zero for either D or E.

Table 1

Magnitudes of Validity Indexes Obtained

               Internal Consistency    External Validity
               Criterion (R/W)         Criterion (HPA)
Biserial r     Number of Items         Number of Items
<0                    4                      12
0-.10                12                      14
.11-.20              15                      26
.21-.35              28                      32
>.35                 31                       6
Total                90                      90


that many of the distractors, apparently adequate
within the senior group, failed to function
effectively for graduate students. However,
for the graduate group, the average
Davis Difficulty Index (2) was 57. This
value corresponds to a proportion of successes
of .63. This indicates that in spite of the
shrinkage of distractor effectiveness, the test
was moderately difficult for these high ability
students.

For purposes of comparison, the tests of
the seniors who finished the test were rescored
for the 63 items of the final form.
Figure 1 shows the distribution of scores for
the two groups (seniors and graduate students)
on this form. The graduate students
scored markedly higher, having a mean of
37.1 and S.D. of 7.24 compared with the
senior mean of 28.7 and S.D. of 7.18. The
critical ratio was 6.76.
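The critical ratio can be approximately reproduced from the reported means and standard deviations, taking 53 graduate students and 91 seniors. The sketch assumes the common form in which the difference between means is divided by the square root of the summed squared standard errors; the exact standard-error formula used in the article is not stated, and the function name is hypothetical.

```python
from math import sqrt


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means divided by the standard error of the difference."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Reported figures: graduate students, then seniors.
cr = critical_ratio(37.1, 7.24, 53, 28.7, 7.18, 91)
```

Under this assumption the ratio comes out at about 6.7, close to the reported 6.76; the small gap could reflect rounding of the published means and standard deviations or a slightly different standard-error formula.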

In terms of overlap, only 13 per cent of the
seniors exceeded the median of graduate students,
and only 9 per cent of the graduate
students fell below the median of the seniors.
The low amount of overlap is a definite indication
that the test operates in a valid manner
to identify the more able engineering
students.

But to what extent does the test differentiate
among graduate students with different
abilities? In order to investigate this question,
the graduate student group was divided
into first, second, and third year students according
to the following plan: 1st year, 0 to ...
quarters; 2nd year, 3-6 quarters; and 3rd
year, more than 6 quarters.


Table 2

Differential Performance of Seniors and Graduate Students on the 63-Item Test

                                           Standard              Probability
Group                      N     Mean      Deviation     t       Level
Seniors                    91    28.7      7.18          3.88    P < .001
1st Year Grad. Students    24    34.6      6.26          ...     ...
2nd Year Grad. Students    13    35.3      5.63          2.84    P < .01
3rd Year Grad. Students    16    42.4      ...


Figure 2 shows the distribution of scores
within each of these three groups. The first
and second year students showed similar performance,
but the third year students exhibited
marked superiority. These results are
important when it is remembered that third
year students include only the carefully
screened Ph.D. candidates, the other two
groups consisting of master’s candidates. Table
2 summarizes the performance on the 63-item
test of all four groups (seniors, 1st, 2nd,
3rd year graduate students). In terms of
overlap, only 13 per cent of the students in
the first two years of graduate school reached
or exceeded the median of 3rd year students.
In like manner, only 19 per cent of the latter


group fell below the median of the former. 
These results provide further impressive evi- 
dence of the validity of the 63-item test. To 
the extent that candidates for the Ph.D. de- 
gree are, as a group, more able than other 
students in graduate engineering, the ability 
of the test to identify the more competent 
group is established. 

It may be concluded that the exploratory 
use of the special analogy test has proved it 
to be a feasible device for the evaluation of 
engineering graduates. This conclusion is 
based on the fact that a large number of the 
analogy items discriminated sharply between 
academically superior and inferior students. 
Further support is drawn from the finding of 


Fig. 1. Distribution of senior and graduate student scores on the 63-item test.
[Graduating seniors: N = 91, Q1 = 23, Q2 = 29, Q3 = 33. Graduate students: N = 53,
Q1 = 32, Q2 = 36, Q3 = 44. Horizontal axis: score, 10 to 55.]


Fig. 2. Differential distribution of graduate student scores on the 63-item test.
[First year graduate students: N = 24, Q1 = ..., Q2 = 34, Q3 = 38. Second year graduate
students: N = 13, Q1 = 31, Q2 = 34, Q3 = 40. Third year graduate students: N = 16,
Q1 = 36, Q2 = 44, Q3 = 48. Horizontal axis: score, 10 to 55.]


significant differences between graduating sen- 
iors and graduate students and between grad- 
uate student groups with varying levels of 
ability. It is suggested that further investi- 
gation of the special analogy test may be 
most profitable in the development of instruments
for the assessment of high level engineering
abilities.


Summary 


Studies have shown that degree of success
in engineering studies can be most effectively
predicted by a combination of measures
including previous academic performance,
general intelligence, and mathematical facility.
The nature of the verbal analogy item
suggested that it is peculiarly fitted to the task
of assessing the above attributes.

With this in mind, a 90-item engineering
analogies test was constructed and
administered to 203 engineering seniors. Item
analyses were made using internal consistency and
external validity criteria. The 63 most
discriminating items were combined and
administered to 53 graduate students. The
odd-even reliability of this form was found to be
.86 for these highly selected graduate
students.

The results for the 63-item test were
examined with respect to comparisons within and
between graduate students and seniors. The
performance of graduate students was
markedly superior to that of the seniors. Within
the graduate group, the performance of third
year students (Ph.D. candidates) was
superior to that of first and second year students
(M.A. candidates).

It was concluded that a special analogies 
test effectively assesses engineering abilities. 


Received June 30, 1952. 




THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


The Humm-Wadsworth Temperament Scale as an Indicator of 
the “Problem” Employee 


A. R. Gilliland and S. E. Newman * 


Northwestern University 


The Humm-Wadsworth Temperament Scale 
was first published in 1935. Its primary pur- 
pose was for use as an aid in industrial selec- 
tion. The scale consists of 318 questions, but 
of these only about half are scored. The 
others are for use as a “setting” for the scored 
items. The scale is based upon the Rosanoff 
classification of Personality and purports to 
measure seven different components. The 
test was standardized on seven groups of sub- 
jects each representing a relatively pure type 
of that component. A highly complicated 
method of scoring and validation has been de- 
vised. Suffice it to say that the split half re- 
liability of the various components varied 
from .70 to .90 and the validity as checked 
against new criterion groups was .85 to .98 as 
reported by the authors (1). The reliabil- 
ities have been rechecked by Dysinger (4) by 
the test-retest method and found to be even 
higher than those reported by the authors. 
Most of the components are independently
variable; only two components, manic and
depressive, show high intercorrelations (r =
.88).

The scale has been widely used with col- 
lege students, psychotic groups, and in indus- 
try. No attempt will be made to review all 
these studies. Reed and Wittman (5) gave 
the scale to 477 Elgin Hospital patients and 
compared the scores with a normal control 
group. Only the normal and cycloid com- 
ponents were significantly different for the 
two groups. Dorcus (3) used the scale with
an industrial group. He reports that it cor- 
rectly diagnosed 73% of the poor group and 
65% of the superior group. 


In the present study the Humm-Wadsworth
scale was given to the employees of a relatively
large industrial organization employing largely
“white collar” workers. The scale was
administered approximately ten years ago and the
evaluation was made about nine years later. The
scales of the 405 employees whose surnames
begin with letters before “I” in the alphabet
were scored. This should constitute a
random sample from approximately half the
population. Of this group, 191 were still employed
and rated as “successful” or “satisfactory.”
Another group of 139 had withdrawn from the
company but without any unfavorable service
notation. Another group of 75 had either been
dismissed or resigned while on probation. These
are classified as “undesirable.” Using a method
of score evaluation as nearly as possible like
that used by Humm (2) in his study of Los
Angeles policemen, and checking the method
further, as best we could, in a personal
conference with the author, the 405 employees were
classified in terms of their Integration Index
and Component Control Measure into five
groups: very good risks, good, questionable,
poor, and very poor. Those with Integration
Indices and Component Control Measures all
above 5, for example, were classified as very
good risks. Those with at least two ratings as
low as 1 were called very poor risks.

* Now at Human Resources Research Center, Keesler AFB, Miss.


Of the employed group of 191 still with the
company and doing satisfactory work, 9.4%
received a very good rating on the test and
5.7% received a very poor scale rating. Of
the 139 no longer employed but with no
evidence about their success, 12.2% were rated
very good by the test and 5.8% as very poor.
Of the 75 who had been dismissed or
withdrew for cause, 12.0% were classified as very
good risks by the scale and 5.4% as very poor
risks. Thus it is apparent that these results
show no difference between the three
employee groups in terms of scores on the
scale.


As another method of evaluation, the data
were arranged in a 3 x 5 table with three
groups of employees in terms of their work
record as one axis and five degrees of success
on the scale in terms of the Integration Index
and Component Control Measure as the other
axis. From this table a chi square as a test
of deviation from the null hypothesis was
calculated. A chi square of 5.93 was obtained.
With eight degrees of freedom these data gave
a p of .65. That is, so great a difference as
this would occur by chance 65 times out of
100. When the middle group who had left
the company between the time of testing and
the time the study was made was omitted,
no important change in the relationship was
apparent.
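The probability attached to the chi square above can be checked with a short computation. The following is a minimal sketch, not the authors' own procedure: it assumes the usual (rows - 1)(columns - 1) rule for degrees of freedom and uses the closed-form upper tail of the chi-square distribution that holds for an even number of degrees of freedom. The function name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi-square >= x) for a positive even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

rows, cols = 3, 5                  # work-record groups x scale ratings
df = (rows - 1) * (cols - 1)       # degrees of freedom for the table
p = chi2_sf_even_df(5.93, df)      # chi square reported in the text
print(df, p)
```

Under these assumptions the table has 8 degrees of freedom and the tail probability comes out at roughly .65, in line with the value reported.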

Inspection of the seven components of the
test showed no component in which there was
a significant difference between the
satisfactory and unsatisfactory employees. An
examination of the Integration Index, which is
a summary value obtained from the test, also
showed no difference between these two
groups.

The failure of the test to differentiate the
satisfactory from the unsatisfactory workers
by any of the above methods may be due to
any one or a combination of the following:
(1) the test may not adequately measure
the components it purports to measure; (2)
these components may not be essential
elements for success in this industry; and (3)
the company cannot distinguish between
satisfactory and unsatisfactory workers.




This study does not prove that the
Humm-Wadsworth scale may not be successful in
selecting workers in some industries but it
certainly gave no evidence of success in this
situation.


Received August 4, 1952. 


References 


1. Humm, D. G., and Humm, Katherine A. Validity
of the Humm-Wadsworth Temperament Scale
with consideration of the effects of the
subjects’ response bias. J. Psychol., 1944, 18,
56-65.

2. Humm, D. G., and Humm, Katherine A. Humm- 
Wadsworth Temperament Scale appraisals 
compared with criteria of job success in the 
Los Angeles Police Department. J. Psychol., 
1950, 30, 63-75. 

3. Dorcus, R. A brief study of the Humm-Wads- 
worth Temperament Scale and the Guilford- 
Martin Personality Inventory in an industrial 
situation. J. appl. Psychol., 1944, 28, 302- 
307. 

4. Dysinger, D. W. A critique of the Humm-Wads- 
worth Temperament Scale. J. abn. soc. Psy- 
chol., 1939, 34, 73-83. 

5. Reed, P. H., and Wittman, P. “Blind” diagnoses
on several personality questionnaires checked
with each other and the psychiatric diagnoses.
Psychol. Bull., 1942, 39, 592.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


The Prediction of Success and Failure in Elementary Foreign 
Language Courses 


Harold C. Peters * 


The Pennsylvania State College 


This study is concerned with the prediction 
of success and failure in the elementary 
courses in French, Spanish, and German at 
the Pennsylvania State College. Predictions 
were made on the basis of scores on the Penn- 
sylvania State College Academic Aptitude 
Examination (3), parts one and two. Sepa- 
rate predictions were made for each of the 
above mentioned languages. 


Procedure 


The Subjects. The subjects in this study 
were all the freshmen in the Pennsylvania 
State College who were enrolled in the ele- 
mentary courses in French, Spanish, and 
German in September 1951, and had taken 
the Pennsylvania State College Academic 
Aptitude Examination. The total number 
of subjects is 443 divided among the three 
languages in the following manner: (1) 
French—47; (2) Spanish—189; and (3) 
German—207. 

Since the study is directed toward pre- 
dicting success it was felt that freshmen 
would be the best subjects. According to 
Feder (2) the function of prediction in edu- 
cation is to facilitate guidance, and, if it can 
be effective, educational guidance is most val- 
uable when applied to freshmen who are be-
ginning their academic careers.

The Criterion. The criterion used for
success was the teachers’ grades in the three
foreign language courses. The grades at the
Pennsylvania State College range from -2
to 3, with 3 being the highest possible
grade which can be attained in a course and
-2 being the lowest failing grade. A
grade of -1 is also given and this too is a
failing grade. Since it was felt that those
students who received a grade of 0 (the
lowest passing grade) had not achieved more
than the barest minimum of success, the
students who received such a grade were placed
in the failing group. The composition of the
groups then becomes as follows: (1) Passing:
grades 3, 2, and 1; and (2) Failing: grades
-2, -1, and 0.

* The author wishes to express his sincere appreciation to Dr. William U. Snyder and Dr. Ila H. Gehman under whose direction this study was done.

The Predictive Instrument. The
instrument used in this study was the Pennsylvania
State College Academic Aptitude
Examination, parts one and two. The verbal nature
of these two parts (vocabulary and paragraph
reading) does not necessarily indicate that
they would be particularly useful in the
prediction of success in the study of a foreign
language. However, it seems that the two
or more skills or achievements involved in
these tests might be expected to have a direct
relationship to language skills. All items in
both tests are of a multiple choice nature with
five possible choices.

Bernard (1) feels that the “learning of a
foreign language consists fundamentally in
the acquisition of an additional set of
symbols for old, familiar meanings. . . . Since
the most pressing need for the student is the
knowledge of the meaning of these new
symbols, the preponderant importance of
vocabulary becomes at once apparent.” Symonds
(4) states that “presumably there should be
some relationship between the size of English
vocabulary and the ability to learn a new
language.”

Considering the importance of vocabulary
in the learning of a foreign language it is felt
that the measurement of the skill involved
in learning a vocabulary will be of some
value in predicting success in learning a
foreign language. Since this cannot be done
directly, we used a measure of proficiency in
English vocabulary as being indicative of the
ability to learn vocabulary, with the feeling
that the same skills involved in developing
the English vocabulary are operating in
learning the vocabulary of a foreign language.




The paragraph reading part of the test
indicates not only the subject’s ability to read
a paragraph, but also his ability to
understand what he has read, as measured by his
answers to a set of questions concerning the
subject matter of the paragraph. In being
able to understand the meaning of a
paragraph the subject must have an adequate
vocabulary, must be able to learn the meanings
of new words from their context, and must
have some knowledge of grammar. These
skills are of importance also in the learning
and mastery of a foreign language.

Experimental Design. The procedure used
in this study was as follows: The grades of
all freshmen registered in the courses in
elementary French, Spanish, and German at the
Pennsylvania State College were collected
from the respective language departments.
The grades were divided into two groups,
with an equal number of grades of each
language in both groups. This was done by
selecting every other student (in each
language group) from a list of students,
arranged in order of descending magnitude of
grades, and placing him in one group. The
other group was composed of the remaining
students. This method insured an
approximately equal distribution of grades for each
group. Of the 507 students in these two
groups, 443 had taken the Academic Aptitude
Examination and their scores had been
recorded by the psychology department. These
scores were then collected and the analysis
was begun.

What was sought was a point, or score,
on the Academic Aptitude Examination (parts
one and two) below which are found those
students who cannot make passing grades (a
grade of 0 was considered to be a failure, for
reasons previously discussed) in their
language courses.

This cut-off point was determined on one of
the two groups (many points were tried
and the one yielding the best prediction was
selected) and its validity was determined by
testing its applicability to the students
comprising the second group. Different cut-off
points were located for each language, with
the expectation that they would be
approximately the same for all the languages.


Each of the two parts of the test was used 
separately in finding the cut-off points, and 
the two tests were also combined and a cut- 
off point on the combined score was located 
for each language. 
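The design just described (an alternating split of the grade-ordered list, a search for the cut-off on one group, and a check on the other) can be sketched as follows. This is an illustrative sketch only. The article does not state how "best prediction" was judged, so the sketch assumes it means the largest difference between the per cent failing below and above the cut-off; it also assumes a score equal to the cut-off counts as "above." The function names and the twelve student records are hypothetical.

```python
def alternate_split(students):
    """Split (grade, score) records into two groups by taking every
    other student from a list ordered by descending grade."""
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    return ranked[0::2], ranked[1::2]

def failure_rates(group, cutoff):
    """Return (% failing at or above cutoff, % failing below cutoff).
    A grade of 0 or lower counts as failing, as in the article."""
    above = [g for g, s in group if s >= cutoff]
    below = [g for g, s in group if s < cutoff]
    if not above or not below:
        return None                      # cut-off leaves one side empty
    pct = lambda grades: 100.0 * sum(g <= 0 for g in grades) / len(grades)
    return pct(above), pct(below)

def best_cutoff(group):
    """Try every observed score as a cut-off and keep the one with the
    largest difference in per cent failing (assumed criterion)."""
    best = None
    for cutoff in sorted({s for _, s in group}):
        rates = failure_rates(group, cutoff)
        if rates is None:
            continue
        diff = rates[1] - rates[0]
        if best is None or diff > best[1]:
            best = (cutoff, diff)
    return best

# Hypothetical (grade, aptitude score) records for one language.
students = [(3, 72), (3, 68), (2, 66), (2, 61), (1, 63), (1, 57),
            (0, 58), (0, 50), (-1, 52), (-1, 47), (-2, 49), (-2, 44)]

test_group, val_group = alternate_split(students)
cutoff, diff = best_cutoff(test_group)
print(cutoff, diff, failure_rates(val_group, cutoff))
```

Applying the cut-off found on the first group to the second group, as in the last line, is what guards against a cut-off that merely fits the accidents of one sample.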

Statistical Analysis. The statistic which
was computed was the significance of
difference between per cents failing above and
below the various cut-off points which were
established (t test).


Results 


The vocabulary test proved to be a valid 
differentiator between students who achieved 
success, and students who failed in foreign 
language study. Table 1 gives the results of 
the application of the cut-off scores to the 
test group in each of the three languages. It 
can be seen that the vocabulary test was most 
successful in distinguishing between success- 
ful and unsuccessful students when applied to 
the Spanish group. Here, of the students 
who fell below the cut-off score 76.9% failed, 
while only 35.6% of those exceeding the cut- 
off score failed the Spanish course. This dif- 
ference was significant at the .001 level. Only 
with French is the difference not significant 
at the one per cent level. 
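The significance test behind these figures can be illustrated with the Spanish vocabulary result. The sketch below assumes that 40 of the 52 students below the cut-off and 16 of the 45 above it failed, counts that agree with the 76.9% and 35.6% quoted above. The article does not give its formula for the standard error of the difference; the unpooled form used here is an assumption, and it yields a value close to, but not identical with, the 9.19 printed in Table 1. The function name is hypothetical, and the normal curve is used for the probability.

```python
import math

def pct_difference_test(fail_below, n_below, fail_above, n_above):
    """Difference between two independent percentages, its standard
    error (unpooled form, an assumption), the critical ratio, and a
    two-sided normal-curve probability."""
    p_b = fail_below / n_below
    p_a = fail_above / n_above
    diff = p_b - p_a
    se = math.sqrt(p_b * (1 - p_b) / n_below + p_a * (1 - p_a) / n_above)
    ratio = diff / se
    p_two_sided = math.erfc(abs(ratio) / math.sqrt(2))
    return 100 * diff, 100 * se, ratio, p_two_sided

# Spanish, vocabulary test, test group.
diff, se, ratio, p = pct_difference_test(40, 52, 16, 45)
print(diff, se, ratio, p)
```

On these assumptions the difference is about 41 percentage points with a standard error of about 9.2, giving a critical ratio well beyond the value required at the .001 level.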

To determine the validity of the vocabu- 
lary test as a predictor, the various cut-off 
scores were applied to the second group (the 
cross validation group). Table 1 shows that 
there was no decrease in the significance of 
difference in the Spanish group, but in the 
other languages a decrease was found. 

The lack of significance in the French group 
can probably be attributed to the compara- 
tively small number of subjects in the test 
and validation groups of this language. The 
effect of these small numbers is revealed in 
the relatively high σ’s of difference found in
these groups. These were found to be almost 
double those of the other groups. This weak- 
ness was found to be operating when predic- 
tions were made with either of the tests as 
well as when the tests were combined to form 
a single predictor. 

The paragraph reading test proved to be
most efficient in predicting success and
failure in German. Of the students who fell
below the cut-off score, 63.8% failed German.
Among the students who exceeded the cut-off
score only 23.9% failed. This difference was
found to be significant at the .001 level.

In the cross validation group, however, the
greatest success in prediction was made with
the Spanish group, and once again the least
success was found with the French group.

Combining the vocabulary and paragraph
reading tests does not yield a marked
improvement in prediction of success and
failure, as will be revealed by an examination of
Table 1. Especially in the cross validation
group are the results found to be almost
exactly those attained when the two tests were
used separately.


Table 1

The Predictors of Success and Failure

Vocabulary
                      Test Group               Val. Group
                   Fr.     Sp.    Ger.      Fr.      Sp.    Ger.
Score               60      54      62       60       54      62
N                   24      97     104       23       92     103
N above score
  Failing            2      16      15        3        9      16
  Passing            9      29      32        7       27      33
N below score
  Failing            8      40      33        6       41      31
  Passing            5      12      24        7       15      23
% failing
  Above score     18.1    35.6    31.9     30.0     25.0    32.7
  Below score     61.5    76.9    57.9     46.2     73.2    57.4
Difference        43.4    41.3    26.0     16.2     48.2    24.7
σ difference     17.80    9.19    9.43    17.33     9.33    9.49
P                  .05    .001     .01  above .1    .001     .02

Paragraph Reading
                      Test Group               Val. Group
                   Fr.     Sp.    Ger.      Fr.      Sp.    Ger.
Score               29      27      29       29       27      29
N                   24      97     104       23       92     103
N above score
  Failing            2      14      11        3       10      16
  Passing            7      22      35        7       26      32
N below score
  Failing            8      42      37        6       40      31
  Passing            7      19      21        7       16      24
% failing
  Above score     22.2    38.9    23.9     30.0     27.8    33.3
  Below score     53.3    68.9    63.8     46.2     71.4    56.4
Difference        31.1    30.0    39.9     16.2     43.6    23.1
σ difference     16.62   10.60    8.91    17.33     9.61    9.54
P                   .1     .01    .001  above .1    .001     .02

Combined Tests
                      Test Group               Val. Group
                   Fr.     Sp.    Ger.      Fr.      Sp.    Ger.
Score               92      79      87       92       79      87
N                   24      97     104       23       92     103
N above score
  Failing            1      15      16        2        8      17
  Passing            7      27      35        7       28      34
N below score
  Failing            9      41      32        7       42      30
  Passing            7      14      21        7       14      22
% failing
  Above score     12.5    33.8    31.4     22.2     22.2    33.3
  Below score     56.2    74.5    60.4     50.0     75.0    57.7
Difference        43.7    40.7    29.0     27.8     52.8    24.4
σ difference     17.05    9.39    9.35    17.30     9.03    9.47
P                  .02    .001     .01  above .1    .001     .02


Summary 


The object of this study was to determine
the efficiency of parts one (vocabulary) and
two (paragraph reading) of the Pennsylvania
State College Academic Aptitude
Examination in predicting success and failure in the
elementary courses in the modern foreign
languages.

The results of this study show:

1. The greatest success in prediction was
achieved with the Spanish group.

2. Success in French was most difficult to
predict. This was attributed to the small
number of subjects in this group.

3. There was very little difference in the
efficiency of the predictive instruments. The
combined tests were generally most successful
and the vocabulary test probably somewhat
more effective than the paragraph reading.

This study demonstrated that it is possible
to predict success and failure in the modern
foreign languages. It was further
demonstrated that tests of vocabulary and
paragraph reading can be used to make this
prediction.

It is suggested that the college administra- 
tion take the responsibility for selecting a 
method of giving the students with low lan- 
guage aptitude (as measured by any of the 
instruments used in this study) an oppor- 
tunity to derive some value out of foreign 
language study. This would no doubt in- 
volve some special treatment such as can be 
provided by special classes. 

The highly significant results found in this 
study indicate that procedures such as these 
could be applied in schools other than the 
Pennsylvania State College. It is likely, how- 
ever, that each school would find it expedi- 
tious to determine for itself, the most effi- 
cient predictive score. If necessary, individ- 
ual schools could substitute other tests of a 
similar nature with which comparable results 
might be obtained. 


Received July 15, 1952. 


References 


1. Bernard, W. Psychological principles of language 
learning and the bilingual reading method. 
Mod. Lang. J., 1951, 35, 87-96. 

2. Feder, D. D. An evaluation of some problems in 
the prediction of achievement at the college 
level. J. educ. Psychol., 1935, 26, 597-603. 

3. Moore, B. V., and Castore, G. F. The Pennsyl- 
vania State College Academic Aptitude Ex- 
amination, 1947 revision. The Penn. State 
Coll., 1948. 

4. Symonds, P. M. A modern foreign language test.
Publications of the American and Canadian
Committees on Modern Languages, vol. 14,
New York: The Macmillan Co., 1920.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


Predicting Grades in Advanced College Mathematics 


John R. Kinzer and Lydia Greene Kinzer 
The Ohio State University 


This is a study of 1,244 students, of whom
78 were women, who took college algebra
and of their success in subsequent courses in
mathematics. The study followed these
students until 29 remained at the end of seven
courses in the prescribed sequence.
Twenty-five of these remaining 29 students had been
graduated at the time of this writing.

The seven courses in mathematics making
up the sequence are briefly described as
follows:

421. College Algebra. Five credit hours. . . .

422. Trigonometry. Five credit hours. . . .
Prerequisite, Mathematics 421. . . .

423. Analytic Geometry. Five credit hours.
. . . Prerequisite, Mathematics 422. . . .

441. Calculus. Five credit hours. . . .
Prerequisite, Mathematics 423. Differentiation of
algebraic forms, with applications; successive
differentiation; differentiation of transcendental
functions; parametric equations, differentials;
curvature; theorem of mean value; indeterminate
forms.

442. Calculus. Five credit hours. . . .
Prerequisite, Mathematics 441. Integration of
standard elementary forms, and integration by various
devices; definite integrals; application to
geometry and physics.

443. Calculus. Five credit hours. . . .
Prerequisite, Mathematics 442. Numerical series
and power series; differential equations;
hyperbolic functions; partial differentiation; multiple
integrals, and applications.

601. Advanced Calculus. Five credit hours.
. . . Prerequisite, Mathematics 443. The theory
of limits, functions, continuity; definition and
meaning of ordinary and partial derivatives;
definition of definite integrals, proper and
improper; fundamental theorem of the integral
calculus; functions defined as integrals containing a
parameter; mean value theorems; convergence of
series; power series; implicit functions.

The course in advanced calculus was chosen
as the culminating point because it is thought
to represent the type of thinking believed
to be important in the graduate study of
mathematics.

The sample is 1,244 students who were
enrolled in college algebra at the Ohio State
University in the autumn quarter of 1946.
The data were collected after most of the
students were presumed to have had time to
complete the entire sequence.

In addition to course grades, percentile 
scores on the Ohio State Psychological Ex- 
amination (OSPE) were included in the com- 
putations. 

For the 29 students who completed the
entire seven-course sequence some personal
data are given to describe this sample in
greater detail.

Table 1 presents the intercorrelation
matrices of OSPE and mathematics course grades.
These are presented in a manner such that
one can easily see how the coefficients of
correlation change as the sample decreases in
size.

The means presented in Table 1 show that
the better students in the early courses tend
to go on into the more advanced courses.
The mean OSPE gradually increases, until
the last group is reached where there is a
sharp upward trend. The mean OSPE of the
29 students in the advanced calculus course
is 79.1 percentile.

In Table 2 are presented the regression
coefficients and coefficients of multiple
correlation. Although some of the coefficients
in the regression equation are negative, none
of the negative coefficients is significantly
different from zero.

The 29 personnel cards filled out during
Freshman Week by the 28 men and one
woman who took Advanced Calculus were
examined in an attempt to discover some
clues as to success in advanced mathematics.
Of the 29 persons who took seven courses in
mathematics, 12 made A or B grades in
Advanced Calculus; 17 made C, D or E grades.
Chi-square was used to test for independence
of the grade classification and the following
classifications:

1. Like mathematics versus like some other
subject (χ² = 0.000).



Table 1 


Coefficients of Correlation, Means, and Standard Deviations 


Note: OSPE scores are percentiles. Grades are expressed on the basis of A= 4, B = 3,C = 2, D = 1. 


Mathematics Courses 


OSPE 421 422 423 441 442 443 601
N x1 x2 x3 x4 x5 x6 x7 y
OSPE 1,244 xi i 3 
978 27 -24 
693 21 20 20 
536 .21 19 .20 .19 
416 20 20 2 1 ià 
326 i 20 3 25 å J A 
29 is 26 43 OO 06 =de 36 
Math 421 078 xe 58 
a 54 55 
536 52. S 
a6 51 5446s 
326 1 Si AT 39 34 
50 ze 20 A 34 ii 19 
Math 422 693. x3 - 
a 52 49 
RA 49 B 8B 
a 46 46 40 3 
29 54 24 23 24 a8 
Math 423 536 xı S s 
z $o 
“o 
29 ; 44 40 43 
59 
Math 441 416 xs ji 57 45 
326 “Og 
Ba -65 25 48 
M 56 
ath 442 326 x6 6 64 
29 ` 
Math 443 29 x7 a 
M 2.2 
ean 1,244 pin 24 2.3 
978 rae 25 26 21 
693 a zé 27 Za 22 
536 eats 298 28 2S 24 «23 
416 AA z0- 20 2r 25 25 28 
326 oe ga 3 32), 
79.1 . 
1.2 
pandara 1,244 264 i7 ie 
Viation 078 a 10 R0 A 
693 me 10 10 it 1d 
536 4 10 10 10 10 12 
416 ð o go Or O% xo 1 
326 y 0.8 0.8 0.8 0.8 i 1.0 1.2 
a 21.9 i 


Table 2 
Regression Coefficients and Coefficients of Multiple Correlation

Mathematics Course, N, Dependent Variable, OSPE (x1), 421 (x2), 422 (x3), 423 (x4), 441 (x5), 442 (x6), 443 (x7), Constant, Coefficient of Multiple Correlation
421 1,244 ye 0144** eo a 
422 978 ys -0040** aoe J a 
423 693 ys -0025 39 ga —.10 po 
441 536 ys 0017 Age’ 2g AD 02 pom 
442 416 ye  —.0004 O7 BF 25 = .46** —.00 n a 
443 326 y7 —.0005 04 03 25°" 12 .41** 22 OF 
601 29 ys -0116 —.28 —.06 36 O01 S87** — 36 —.05 fo 


* Significantly different from zero (5% level). 
** Significantly different from zero (1% level). 


2. Dislike mathematics versus dislike some
other subject (χ² = 0.000).

3. One or both parents dead versus both
parents living (χ² = 0.008).

4. Family of fewer than four children versus
family of four or more children (χ² = 1.543).

5. Went to college the year following H. S.
versus additional time lapse between H. S.
and college (χ² = 4.138, significant at
5% level).

6. Live alone versus do not live alone (χ² =
5.250, significant at 5% level).


The significant values of χ² tend to indicate
that high grades in Advanced Calculus are
associated with going to college immediately
after high school graduation and with
rooming alone.
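The significance levels quoted for the two largest chi-square values can be checked directly. The sketch below assumes each comparison is a two-by-two table with one degree of freedom, which the paired classifications suggest but the article does not state; the function name is hypothetical. With one degree of freedom the upper-tail probability has a closed form in terms of the complementary error function.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of chi-square with one degree of freedom:
    P(chi-square >= x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

# Chi-square values reported for classifications 5 and 6.
p_time_lapse = chi2_sf_1df(4.138)
p_live_alone = chi2_sf_1df(5.250)
print(p_time_lapse, p_live_alone)
```

On this assumption both probabilities fall between .01 and .05, which agrees with the 5% level reported for these two classifications.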

At the time of this writing, 25 of the 29
students had been graduated with a mean
cumulative point hour ratio of 2.89 (A = 4).
The correlation between OSPE percentiles
and cumulative point hour ratio at graduation
is .20, but the correlation between cumulative
point hour ratio and grades in Advanced
Calculus alone is .63.


Summary 


1. This study reports coefficients of
correlation, means and standard deviations of
mathematics course grades and Ohio State
Psychological Examination percentiles.

2. Regression equations for predicting
success in mathematics courses are presented.

3. Some personal data from students’
official records are discussed briefly.


Received January 19, 1953.
Early publication.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


A New Method For Determining Readability of Standardized 
Tests * 


Fritz W. Forbes 


University of Hawaii 


and 
William C. Cottle 


University of Kansas 


Interest in the measurement of readability 
has been growing steadily since the first basic 
research was done by Vogel and Washburne 
(12) in 1928. Klare (8) in 1950 estimated 
that 34 formulas or methods for determining 
the reading difficulty of printed material had 
een devised. Five of the more recently de- 
veloped formulas in widespread use have been 
Singled out for critical analyses here: the 

ale-Chall, Flesch, Lorge, Lewerenz, and 

Oakam formulas. 

_ Little has been done to ascertain the read- 
mg level necessary to understand the content 
of standardized testing materials. Johnson 
and Bond (7) have written one of the few 
y icles on this specific topic. In their paper 
the Flesch formula was used for testing read- 
Mg ease of nine standardized tests in common 
Use in V, A, Advisement Centers. The gen- 
ĉral conclusion was that many tests are being 
Stemistered to people who do not under- 
. them because the readability of the 
*sts is too difficult. 

Stefflre (10) made a study of the relative reading difficulty of six interest inventories using the Flesch formula. High correlation between the Flesch formula and other formulas was reported. Roeber (9) compared seven interest inventories as to word usage. The percentage of occurrence of different words appearing in the inventories was computed. He found a large number of words beyond the understanding of ninth graders. [Sentence partly illegible in the source: his recommendation for ... does appear in a later form of the inventories.]

Testing instruments are becoming so varied and numerous that persons who use them need every help possible to determine the usefulness of the instruments for particular populations.

* Abstract of Forbes’ Ed.D. dissertation done at the University of Kansas under the direction of Cottle.

This study was carried out in order to determine objectively the reading difficulty of standardized tests commonly used in counseling and to develop a new and simplified method for determining the reading level of these standardized tests. It is believed that this simplified readability method will also be found useful in measuring the readability of public opinion polling questions and of headlines and slogans in advertising copy.


Method 


Five of the more popular techniques for 
evaluating the reading difficulty of printed 
matter were critically analyzed in relation to 
standardized tests. The Dale-Chall, Flesch, 
Lorge, Lewerenz, and Yoakam formulas were 
applied to 27 selected standardized tests com- 
monly used for counseling at various educa- 
tional levels. The mean score of reading dif- 
ficulty was then obtained. 

The choice of the tests to be used in this 
study was determined from previous studies 
made upon test preference and from the 
newer tests indicated by the records of the 
University of Kansas Guidance Bureau. 

Berkshire and others (2) have tabulated 
responses that were received from 290 test- 
ing centers. They concluded that there is 
general agreement on approximately 15 to 
20 tests as being common to guidance test- 
ing. Beyond this point test preference varies 
widely. Tests were chosen from this list for 
analysis if they were reported by at least 25 
of the reporting centers as being one of the 
most commonly used tests. This same study 
shows in tabular form the results obtained 




Table 1


Comparative Grade Placement of Selected Standardized Tests According to Various Readability Formulas and the Application of the New Forbes Formula to Items and Instructions for these Tests


                      Dale-                   Lewer-          Av. of  Forbes  Forbes
Test                  Chall   Flesch  Lorge   enz     Yoakam  Five    Items   Instr.
MMPI                  5.5     6.1     4.4     6.2     4.8     5.4     6.5     7.3
School Inventory      5.6     6.3     5.0     7.6     3.3     5.5     5.0     7.2
Calif. Test Pers.     6.2     7.2     5.3     5.7     6.6     6.2     7.1     6.3
AGCT                  5.9     11.0    [?]     8.0     4.0     6.9     6.2     6.1
Guilford-Zimmerman    6.0     7.5     5.6     8.2     [?]     6.9     7.4     8.3
Otis Q-S              6.1     7.4     5.8     8.2     [?]     7.0     7.6     6.4
Adjustment Inv.       6.4     9.1     6.1     [?]     8.0     [?]     7.8     6.1
Minn. Pers. Scale     6.5     9.1     6.2     6.5     10.8    7.8     8.8     8.2
Mooney                6.1     8.3     6.0     8.7     11.0    8.1     8.9     7.2
Bernreuter            6.7     8.4     6.7     7.3     11.9    8.2     9.1     7.0
CTMM                  7.2     10.8    8.8     8.0     6.7     8.3     9.1     6.3
Stanford Ach.         7.0     8.4     7.0     6.4     14.5    8.7     10.6    6.7
Kuder CM              7.3     8.5     7.7     7.8     12.5    8.7     9.7     8.3
Otis Employ.          6.3     7.9     6.3     9.3     14.3    8.8     9.1     6.4
Henmon-Nelson         6.6     9.2     6.1     11.3    12.6    9.1     9.8     5.0
Iowa Silent           8.0     11.4    7.9     9.1     10.1    9.3     9.3     6.7
Lee-Thorpe            8.0     10.0    7.9     7.8     13.6    9.5     10.3    7.6
Kuder BB              7.6     9.2     7.6     8.5     14.5    9.5     10.7    7.9
SRA Reading           8.3     13.2    8.5     9.9     12.6    10.5    9.8     6.8
Cleeton               8.4     14.4    7.4     8.3     16.0*   10.9    12.5    6.9
Strong Voc. Int.      8.9     15.8    6.7     12.0    13.3    11.4    10.2    8.5
Coop. Reading         8.7     14.0    9.9     10.4    14.0    11.4    10.4    7.8
Minn. Reading         9.0     13.2    9.4     9.7     16.0*   11.5    11.9    9.2
Ohio State Psy.       10.7    16.5*   9.6     9.0     11.8    11.5    [?]     7.3
ACE                   8.5     16.1    8.5     9.4     16.0*   11.7    12.7    7.6
Coop. Gen. Cult.      8.5     15.6    10.7    10.3    16.0*   12.2    Col.    7.5
Study of Values       9.1     16.1    9.6     12.7    16.0*   12.7    [?]     9.6

* Estimate of the grade; the formulas did not indicate grades at these levels.
[?] Value illegible in the source copy.


from three other studies by Brophy and Long (3), Darley and Marquis (6), and Baker and Peatman (1). The findings from these three studies were comparable to those of Berkshire and others.

Standardized testing instruments that have become popular since the appearance of the above articles were checked for the frequency of their use at the University of Kansas Guidance Bureau. Nine tests were added to the original list to be analyzed. Six of these were published after the above cited studies on preference had been made. Two of the remaining three that were added were reading tests, because of the nature of the present study. The one remaining test, the Minnesota Personality Scale (Men) 1941, was added at the discretion of the writers. The tests were chosen from five of the general areas of testing listed by the Third Mental Measurements Yearbook (4): Character and Personality, Intelligence (group), Interests, Achievement Batteries, and Reading. They are listed in Table 1.

The five formulas selected for study are the more recently developed techniques for measuring readability. They present several factors which have been used for determining reading difficulty of printed matter, namely, word difficulty, prepositional phrases, sentence length, number of syllables per hundred words, number of different words, and percentage of words beginning with certain letters. Each of the formulas has been carefully developed and exhibits a fair degree of reliability and validity. In the Yoakam formula, “hard words” vary in difficulty according to their frequency and range of occurrence above the most common four thousand words. The Lewerenz formula is based solely on word difficulty, basing the vocabulary difficulty on words with certain initial letters. The Flesch formula considers the length of the word the index of difficulty of that word; the more syllables a word has, the more difficult it is. The Dale-Chall formula uses a list of three thousand words; any word not appearing on this list is considered difficult. The Lorge formula considers as a “hard word” any word other than the 769 words that are common to the first one thousand most frequent English words on the Thorndike list and the first thousand most frequent words known by children entering the first grade.
On the basis of the facts mentioned above, it would seem that the Lorge formula interpretation of “hard words” is too simple and limited for the purpose outlined here; the Dale-Chall method approaches a more realistic and practical definition of difficult words; and the Yoakam formula is perhaps the most realistic of all the formulas for use with testing instruments. The idea that difficult and easy words begin with certain letters, presented in the Lewerenz formula, does not apply to standardized tests; and the number of syllables a word has, as proposed by Flesch, does not necessarily give the index of difficulty.

The grade level scores obtained for each test from the five formulas were averaged in order to obtain a mean grade level reading difficulty score for each test. These mean scores were taken as criterion grade level scores of reading difficulty for these selected tests. They are shown in Table 1.

A definite difference was noted in the results of the measurement of the various tests by these five formulas as shown in Table 1. There was as much as 8.13 grades difference in the reading difficulty of a single test as determined by two different formulas.

At the same time the five formulas correlated significantly with each other. The rank order correlations ranged from .91 between the Dale-Chall and Flesch formulas to .59 between the Lewerenz and Yoakam formulas. These intercorrelations are shown in Table 2.

The rank order correlations between each of the formulas and the mean grade level score ranged from .95 for the Dale-Chall formula to .77 for the Lewerenz formula as shown in Table 2.

Correlations were also computed by means of the ratio of the estimated true variance to the observed variance between each formula and the means of the five formulas (5). These correlations as shown in Table 3 ranged from .90 between the Dale-Chall formula and the mean of the five formulas to .84 between the Flesch formula and the mean of the five formulas. However, the scores obtained from this study for the reading level of each test correlated slightly over .95 with the mean of the five formulas and ranged between .95 and .72 for the five formulas as shown in Table 3.

Table 2


Intercorrelations (Rho) for the Five Formulas Applied to Twenty-seven Tests and Correlation (Rho) Between Each Formula and Mean of the Five Formulas


Formula       Flesch  Lorge   Lewerenz  Yoakam  Mean of Five
Dale-Chall    .91     .90     .65       [?]     .95
Flesch                [?]     .66       .66     .90
Lorge                         .60       .69     .89
Lewerenz                                .59     .77
Yoakam                                          .89

[?] Value illegible in the source copy.
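
The rank order correlations (rho) reported for the five formulas can be illustrated with a short sketch. It is not the authors' computation: it assumes tied grade placements receive the mean of their ranks and that rho is taken as the product-moment correlation of the two rank vectors, and the function names are hypothetical. The six pairs of values are Dale-Chall and Flesch grade placements for six tests from Table 1 (Minn. Pers. Scale, Mooney, Bernreuter, CTMM, Stanford Ach., Iowa Silent), used only as sample input.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Rank order correlation: product-moment correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Six tests from Table 1, used only as sample input (not the full 27 tests).
dale_chall = [6.5, 6.1, 6.7, 7.2, 7.0, 8.0]
flesch = [9.1, 8.3, 8.4, 10.8, 8.4, 11.4]
print(round(spearman_rho(dale_chall, flesch), 2))
```

Because only six of the twenty-seven tests are used, the printed value is not expected to reproduce the coefficient in Table 2.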




Table 3


Correlations Between Forbes’ Formula, the Five Formulas Applied to Twenty-seven Tests and the Mean of the Five Formulas Using Ratio of Estimated True Variance to Observed Variance (5)


Formula        Dale-Chall  Flesch  Lorge   Lewerenz  Yoakam  Mean of Five
Forbes         .95         .83     .84     .72       .90     .96
Mean of Five   .90         .84     .87     [?]       .86     .87

[?] Value illegible in the source copy.


The knowledge of grammar needed to ap- 
ply most of these five formulas studied is con- 
siderable. Also, the amount of time required 
by these methods makes them quite laborious. 
More than ten hours were required to apply 
some formulas to a single test. The average 
amount of time for the working of a single 
formula on a single test was more than two 
and one half hours. The simplified Forbes 
method, in contrast, requires only approxi- 
mately one half hour per test. 

Word difficulty was used as a common fac- 
tor in all five formulas studied. It was also 
evident from a review of the literature that 
word difficulty was basic to the readability of 
all printed matter. 


Development of the Forbes Method 


The following steps were taken in devel- 
oping the Forbes method which is specifi- 
cally suited for measuring the readability of 
printed matter in standardized tests: 

1. The five formulas studied were applied 
to each of the twenty-seven standardized 
tests. 

2. The mean grade level score of the five 
formulas for each test was taken as the cri- 
terion of readability for the tests. 

3. The vocabulary difficulty was determined for each test by finding the number of words above the most frequently used 4,000 words in three samples of 100 words each selected at the beginning, middle and end of each test.

4. The Thorndike Junior Century Dictionary was used for finding the weights to be assigned to each word above the most commonly used four thousand. The number following the definitions in this dictionary is the weight for that word. The weights range from one to twenty, but since the first four thousand were dropped, only numbers of four and above were used.

5. The total of these weights for each test 
was divided by the number of words in the 
samples, giving the index of vocabulary dif- 
ficulty. 

6. The standardized tests studied were placed in rank order as determined by the mean grade level scores of the five formulas. These tests were set off into grade groups. All tests falling within one half grade level above or below the grade were considered with that grade group. For example, grade level scores of 7.5 to 8.5 would be considered characteristic of the eighth grade reading difficulty. The largest and smallest indices of vocabulary difficulty falling within any one grade group were considered the limits for that grade. Table 4 gives the indices of vocabulary difficulty, setting the limits for the various grade levels.

7. The grade level scores derived from this method give the average reading grade level required for the person taking the test in order that the test be understood and done properly.


Table 4


Grade Level of Reading Difficulty as Determined by the Index of Vocabulary Difficulty


Index of Vocabulary Difficulty    Grade Level
1.4510 and above                  College
1.2510-1.4509                     12th grade
1.0510-1.2509                     11th grade
.8510-1.0509                      10th grade
.6510-.8509                       9th grade
.4510-.6509                       8th grade
.2510-.4509                       7th grade
.0510-.2509                       6th grade
.0509 and below                   5th grade


The exact method of applying the new for- 
mula is listed as follows: 

1. Three samples of 100 words each were taken from the tests to be analyzed. The samples were selected in each test at the beginning, middle, and end. The only requirements for the samples were that they consist of an even hundred words, that each sample begin with the first word of an item, and that vocabulary tests be omitted from the samples. It seemed only fair to omit the vocabulary sections in order to get the average reading difficulty of the standardized tests.

It seemed easiest to begin with the first word of the first item of a test and count the first hundred word sample exactly. The middle sample was selected as near the midpoint of the test as possible. Starting with the middle item, count backward to the initial word of an item close to fifty words back. The remainder of the middle one hundred word sample was secured by counting the difference from one hundred in words beyond this middle item. The third sample was taken by counting backwards from the last word of the test items until one hundred words were counted. Should the one hundred words end within the item, proceed counting backwards until the first word of an item is reached; then, in order to get exactly the one hundred words, omit the number over one hundred at the end of the sample.
2. Each word that appeared difficult to the grader was written on a sheet of paper. The words were then found in the 1942 Thorndike Junior Century Dictionary (11). The number following the definition in this dictionary is the weight for that word. These numbers range from one to twenty, representing the twenty successive thousands of words most commonly used in the English language. Only words above the most frequently used four thousand words were given a weight. Any word having a weight of four or above was considered a difficult word and was listed. Words used more than once in the samples were given their weight each time they were used.

3. The weights for the three samples were totaled and divided by the number of words in the samples, 300 in this case. This gave the index of vocabulary difficulty for the standardized tests.

4. Using the indices of vocabulary diffi- 
culty obtained from the above three steps, 
refer to Table 4 in order to determine grade 
level of difficulty of the printed matter in the 
test being analyzed. Grade level scores may 
be interpolated to the nearest tenth of a 
grade. 

5. The reading difficulty was also figured 
for the instructions for each of the tests ana- 
lyzed. The samples in some cases included 
all directions to the tests when they consisted 
of 300 words or less; other samplings fol- 
lowed the procedure outlined above for the 
test, that is, taking 100 word samples at three 
points throughout the instructions. 

There is little room for decisions to be 
made by the scorer who uses the Forbes 
method since the words are weighted in ac- 
cordance with an accepted word list. If a 
variant of a word or a hyphenated word does 
not appear in this list, no weight is given. 
Only words that appear in the Thorndike 
Junior Century Dictionary are given weights. 
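
The arithmetic of the steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the sample weights are hypothetical, the band limits are copied from Table 4, and the interpolation to tenths of a grade mentioned in step 4 is not attempted because the text does not spell out how it is done.

```python
# Lower limit of each band in Table 4, highest band first.
GRADE_BANDS = [
    (1.4510, "College"),
    (1.2510, "12th grade"),
    (1.0510, "11th grade"),
    (0.8510, "10th grade"),
    (0.6510, "9th grade"),
    (0.4510, "8th grade"),
    (0.2510, "7th grade"),
    (0.0510, "6th grade"),
]


def vocabulary_index(weights, words_in_samples=300):
    """Total of the dictionary weights divided by the number of words sampled.

    `weights` holds one entry per occurrence of a difficult word, so a word
    used twice in the samples contributes its weight twice.
    """
    return sum(weights) / words_in_samples


def grade_level(index):
    """Look up the Table 4 band that contains the index."""
    for lower_limit, label in GRADE_BANDS:
        if index >= lower_limit:
            return label
    return "5th grade"  # .0509 and below


# Hypothetical weights for the difficult words found in three 100-word samples.
example_weights = [5, 7, 4, 12, 9, 6, 20, 4, 8, 15]  # total of 90
index = vocabulary_index(example_weights)  # 90 / 300 = 0.30
print(index, grade_level(index))
```

With these made-up weights the index is .30, which falls in the .2510-.4509 band of Table 4, that is, seventh grade reading difficulty.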


Summary 


1. Review of the literature showed that no specific method has been developed for finding the reading difficulty of standardized tests (or public opinion polling questions or headlines and slogans in advertisements) up to the present time.

2. The five techniques for measuring the 
readability of printed matter that were ap- 
plied to the 27 standardized tests in this 
study showed wide variation as to the grade 
placement of the reading difficulty of these 
tests. 

3. The usual methods in use for determin- 
ing the readability of reading material con- 
sume a great amount of time for their appli- 
cation. 

4. These methods also required much inter- 
pretation and judgment on the part of the 
user, thus greatly lessening their objectivity. 

5. The peculiar make-up of the reading matter in standardized tests required that only the vocabulary difficulty factor be used for determining their readability. The use of such factors as sentence length and prepositional phrases was not practical since many of the tests have sections composed only of word lists.

6. The instructions to the standardized 
tests were easily within the range of reading 
difficulty of those for whom the tests were 
designed. 

7. The use of short word lists for determin- 
ing difficult words tended to give too coarse a 
classification of grade levels of reading. A 
longer list made the method for determining 
the readability of standardized tests more 
sensitive, spreading the grade level scores over 
a longer range. 

8. The method developed in this study was 
based entirely upon reading matter found in 
commonly used standardized tests. It is a 
technique applicable only to such reading 
matter or to similar material. 

9. The method evolved in this study is 
easily applied, consumes little time, and shows 
high objectivity by the elimination of most of 
the interpretations and judgments formerly 
left to the scorer. 


Received August 4, 1952. 


References 


1. Baker, G., and Peatman, J. G. Tests used in Veterans Administration advisement units. Amer. Psychol., 1947, 2, 99-102.

2. Berkshire, J. R., Bugental, J. F. T., Cassens, F. P., and Edgerton, H. A. Test preference in guidance centers. Occupations, 1948, 26, 337-343.

3. Brophy, D. F., and Long, L. Veterans Administration vocational training program: Processing procedures used by the College of the City of New York. Psychol. Bull., 1944, 41, 795-802.

4. Buros, O. The Third Mental Measurements Yearbook. New Brunswick: Rutgers University Press, 1949.

5. Cottle, W. C. Card versus booklet forms of the MMPI. J. appl. Psychol., 1950, 34, 255-259.

6. Darley, J. G., and Marquis, D. G. Veterans guidance centers: A survey of their problems and activities. J. clin. Psychol., 1946, 2, 109-116.

7. Johnson, R. H., and Bond, G. L. Reading ease of commonly used tests. J. appl. Psychol., 1950, 34, 319-324.

8. Klare, G. R. Evaluation of quantitative indices of comprehensibility in written communication. Unpublished Ph.D. Thesis, University of Minnesota, 1950.

9. Roeber, E. C. A comparison of seven interest inventories with respect to word usage. J. educ. Res., 1948, 42, 8-17.

10. Stefflre, B. The reading difficulty of interest inventories. Occupations, 1947, 26, 95-96.

11. Thorndike, E. L. Thorndike Junior Century Dictionary, Revised Edition. New York: Scott, Foresman and Company, 1942.

12. Vogel, M., and Washburne, C. An objective method of determining the grade placement of children’s reading material. Elem. Sch. J., 1928, 28, 373-381.


The Journal of Applied Psychology
Vol. 37, No. 3, 1953


A Modified Administration Procedure for the O’Connor Finger 
Dexterity Test 


Edwin A. Fleishman 


USAF Air Training Command, Human Resources Research Center * 


The O’Connor Finger Dexterity Test (8) has been widely used in counseling and selection.¹ It appears to measure dexterity of a finer type than is measured by the Minnesota Rate of Manipulation Tests (11, 9), or by most of the subtests of the Purdue Pegboard (12). The test seems most useful for manual jobs requiring rapid wrist and finger movements, in fine assembly work requiring both speed and precision, and in jobs involving manipulation of small objects. The validity of the test has on occasion been demonstrated for electrical fixture assemblers (14, [illegible]), radio assemblers (16), power-sewing machine operators (10), watch assemblers (1, 3), punch press operators (7), can packers (13), and dental students (5, 6).

Despite its widespread use, the test has two primary difficulties as a selection device. First, relative to other dexterity tests (e.g., Minnesota Rate of Manipulation, Purdue Pegboard), the test takes considerably longer to administer. The time required generally varies from 8 to 15 minutes. Moreover, during this more lengthy time period, the test yields only one score, whereas in a considerably shorter administration time the Purdue Pegboard yields five scores (right, left, both hands, total of these, assembly), and the Minnesota Rate of Manipulation Test yields at least two scores (placing and turning).

A second limitation of the O’Connor test for selection purposes is that it is a work limit test. The examinee’s score is the total number of seconds it takes him to fill the board. This procedure makes it difficult for a single examiner to administer the test to more than one subject at a time.

The present study investigated the feasibility of certain modified administration procedures which would decrease the total time required to give the test and which would render the test more suitable for group administration.²

* Perceptual and Motor Skills Laboratory, Lackland Air Force Base. The research reported in this study was carried out as part of the United States Air Force Research and Development Program. The opinions or conclusions contained in this report are those of the author. They are not to be construed as reflecting the views or indorsement of the Department of the Air Force.

1 This paper is not concerned with the O’Connor Tweezer Dexterity Test which has also received considerable study.


Procedure


The O'Connor Finger Dexterity Test was 
administered under time limit conditions to 
unselected samples of basic airmen at Lack- 
land Air Force Base. The mean age of the 
subjects was 18.9 with a standard deviation 
of 1.3. The test was administered to inde- 
pendent samples of 100 subjects each. One 
group received the test for a four-minute time 
limit condition, another group for a five min- 
ute period, and a third group for a six-minute 
period. Within each sample the tests were re- 
administered for test-retest reliabilities. The 
interval between test and retest was held con- 
stant at one and one-half hours for each group, 
since it was assumed that the length of the 
interval might affect the magnitude of the re- 
liability coefficients. Under these time limit 
conditions, a subject’s score was the total number of pins placed during the allotted time.³
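
The time limit score just described follows the counting rule given in the footnote on pin placement: three pins per hole, 100 holes on the board. A minimal sketch of that rule is given here; the function and constant names are hypothetical, and the input checks are an added assumption rather than part of the test manual.

```python
PINS_PER_HOLE = 3      # stated in the footnote on scoring
HOLES_ON_BOARD = 100   # stated in the footnote on scoring


def time_limit_score(full_holes: int, pins_in_last_hole: int = 0) -> int:
    """Pins placed: three per completely filled hole plus those in the last hole."""
    if not 0 <= pins_in_last_hole < PINS_PER_HOLE:
        raise ValueError("a partly filled hole holds 0, 1 or 2 pins")
    score = PINS_PER_HOLE * full_holes + pins_in_last_hole
    if full_holes < 0 or score > PINS_PER_HOLE * HOLES_ON_BOARD:
        raise ValueError("score must lie between 0 and 300 pins")
    return score


# A subject who filled 55 holes and put one pin in the 56th hole.
print(time_limit_score(55, 1))  # 3 * 55 + 1 = 166
```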

In another sample, 100 subjects were tested 
and retested one and one-half hours later un- 
der the standard work limit conditions in 
which the total number of seconds required 


2 There appears to be very little published evidence 
indicating whether or not time limit and work limit 
methods of administering speed tests are equivalent 
and interchangeable. In one of the few previous 
studies on this problem, Paterson and Tinker (11a), 
working with speed of reading tests, found the work 
limit method to agree with the time limit method as 
closely as each method agreed with itself. 

3 Since 3 pins are placed in each hole the subject’s 
score is three times the number of holes filled up to 
the last hole plus the number of pins in the last hole, 
There are 100 holes in the total board. 




Table 1


Means, Standard Deviations, and Reliabilities for Five Administration Conditions¹


                                                        Test            Retest
Testing Method                         Score            M     S.D.      M     S.D.     Reliability
Work limit, full board                 Seconds          556   62.8      502   53.3     .86
Work limit, half board                 Seconds          277   34.8      256   27.0     .82
Time limit, 6 minutes (360 seconds)    Number of pins   205   19.9      221   23.9     .80
Time limit, 5 minutes (300 seconds)    Number of pins   166   19.9      178   19.4     [?]
Time limit, 4 minutes (240 seconds)    Number of pins   137   14.9      147   14.1     [?]

1 Data for the Work Limit tests are based on the same sample of 100 Airmen. Data for each Time Limit test are based on separate samples of 100 each.
[?] Value illegible in the source copy.


to fill the entire board was recorded. Also 
recorded during these administrations was the 
time required to fill half the board. In an 
additional sample, 100 subjects were given the 
test under work limit conditions and were re- 
tested one and one-half hours later under the 
five minute time limit condition. 

In all, 500 subjects were involved in the 
study. Independent groups were used in each 
phase in order to duplicate the standard test- 
ing conditions. In this way scores and re- 
liability coefficients derived from each ad- 
ministration procedure could be more readily 
compared, uncomplicated by differential prac- 
tice effects from other forms of the test. 


Results 


Table 1 presents the means, standard devi- 
ations and test-retest reliabilities for the vari- 
ous administration procedures. 

It should be noted that these reliability co- 
efficients are to be regarded as conservative 
relative to split-half or immediate retest relia- 
bility estimates often reported. For example, 
Darley (4) has reported a corrected split-half 
reliability of .90 and Blum (1) reported a 
test-retest reliability of .89 for the standard 
test with a half-hour interval between admin- 
istrations. This latter reliability compares 


4 The original method of scoring the test involved 
a small correction in the second half of the test for 
practice on the first half. However, Tiffin and 
Greenley (16) found a correlation of .99 between 
the total time score and scores obtained by the origi- 
nal formula. More recent studies, including those of 
the USES, have used the simpler total time score. 


favorably with our test-retest reliability of .86 following a longer (one and one-half hours) interval. He also reports a reliability of .82 for the half length test which is identical with our results. All these coefficients are higher than the original test-retest reliability of .60 reported by Hines and O’Connor (8) in their original standardization of the test.

The correlations obtained between the time 
limit procedure (five minutes), and the full 
board and half board work limit procedures 
were .96 and .89, respectively, after correc- 
tion for attenuation. This gives some indica- 
tion that the abilities measured by the time 
limit and work limit forms of the test are 
the same. 
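
The correction for attenuation mentioned above divides an observed correlation by the square root of the product of the two reliabilities. The sketch below shows that standard formula only; the observed correlation and reliabilities in the example are hypothetical, since the uncorrected coefficients are not reported in the text, and the function name is an assumption.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between true scores.

    r_xy is the observed correlation between the two forms; r_xx and r_yy
    are the reliabilities of the two forms.
    """
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values, chosen only to show the arithmetic:
# 0.60 / sqrt(0.81 * 0.64) = 0.60 / 0.72
print(round(correct_for_attenuation(0.60, 0.81, 0.64), 2))  # 0.83
```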

It can be seen in Table 1 that there is some loss in reliability when the test is administered as a time limit test. In order to achieve comparable reliability under time limit conditions to that obtained in the full work limit test (.86), nine minutes testing time would probably be required.⁵ However, the reliability achieved in the six-minute trial is probably sufficient for group prediction purposes. If one is using the test as part of a larger selection battery, one of the shorter tests probably has sufficient reliability for inclusion, since a reduction in reliability of this magnitude would have little effect on the composite validity of the battery (see 2, 15). The four-minute test might well be used where a choice

5 Estimated from the six-minute test by the Spearman-Brown Prophecy formula.
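
The nine-minute estimate in this footnote can be retraced with the Spearman-Brown formula. The sketch below assumes, as the estimate itself appears to, that lengthening the time limit acts like lengthening the test; the function names are hypothetical. The six-minute reliability of .80 and the target of .86 are taken from the text and Table 1.

```python
def spearman_brown(r: float, k: float) -> float:
    """Reliability predicted when a test of reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


def length_factor(r: float, target: float) -> float:
    """Lengthening factor k needed to raise reliability r to the target value."""
    return target * (1 - r) / (r * (1 - target))


k = length_factor(0.80, 0.86)   # 0.172 / 0.112, about 1.54
minutes_needed = 6 * k          # about 9.2 minutes
print(round(k, 2), round(minutes_needed, 1))
```

Lengthening the six-minute test by a factor of about 1.54 gives roughly nine minutes of testing time, which agrees with the figure quoted in the text.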




Table 2 


Normative Data for Five Administration Conditions of the O’Connor Finger Dexterity Test¹


Raw Scores 


Conditions 
Work Limit Time Limit 
(Seconds) (No. of pins placed) 
Percentile Full Board Half Board 6 Min, 5 Min. 4 Min. 

100 372 213 270 264 180 
99 430 217 245 219 173 

98 442 221 242 214 169 
96 468 226 236 208 165 

90 481 233 228 196 157 

75 507 254 216 185 146 
63 526 262 210 178 139 

50 548 271 204 170 135 

37 372 278 198 166 131 

25 600 299 189 157 127 

14 632 317 182 147 124 

10 647 323 178 142 122 

6 674 346 174 136 116 

2 717 379 168 119 113 

1 746 384 140 115 101 

1 Norms for the Work Limit tests are based on a sample of 200 Airmen. Norms for each Time Limit test are based on separate samples of 100 each.


must be made (as is often necessary) between including a longer form of the test or adding an additional type of test which broadens the scope of abilities sampled by the battery in the time allowed. For individual prediction and guidance purposes, the standard work limit procedure is probably desirable.

Table 2 summarizes some preliminary normative data for the various administration procedures.

Although these results are based on limited samples and are to be regarded as specific to this kind of population, they [illegible] as a suggestive guide in future [illegible] under these conditions. It is also to be noted that the work limit scores presented are generally higher (poorer performance) than those usually reported in [illegible] populations.


Summary


The O’Connor Finger Dexterity Test was administered under various work limit and time limit conditions. The results indicate that although there is some loss in reliability under the time limit conditions, the reliabilities are probably adequate for group prediction, especially if the test is to be included in a larger battery. Preliminary norms for the modified administration conditions were presented.


Received August 12, 1952. 


References 


1. Blum, M. L. A contribution to manual aptitude measurement in industry. J. appl. Psychol., 1940, 24, 381-416.

2. Brokaw, L. D. Comparative validities of “short” versus “long” tests. J. appl. Psychol., 1951, 35, 325-330.

3. Candee, B., and Blum, M. L. Report of a study done in a watch factory. J. appl. Psychol., 1937, 21, 572-582.

4. Darley, J. G. Reliability of tests in the standard battery (in “Research studies in individual diagnosis”). Univ. of Minnesota, Bull. Empl. Stab. Res. Inst., No. 4, 1934.

5. Douglass, H. R., and McCullough, C. M. Prediction of success in the School of Dentistry. Univ. Minn. Stud. Predict. School Arch., [year illegible], 2, 61-74.

6. Harris, A. J. Relative significance of measures of mechanical aptitude, intelligence, and previous scholarship for predicting achievement in dental school. J. appl. Psychol., 1937, 21, 513-521.

7. Hayes, E. G. Selecting women for shop work. Personnel J., 1932, 11, 69-85.

8. Hines, M., and O’Connor, J. A measure of finger dexterity. J. Personnel Res., 1926, 4, 379-382.

9. Jurgensen, C. E. Extension of the Minnesota Rate of Manipulation Test. J. appl. Psychol., 1943, 27, 164-169.

10. Otis, J. L. Prediction of success in power sewing machine operating. J. appl. Psychol., 1938, 22, 350-366.

11. Paterson, D. G., and Darley, J. G. Men, women, and jobs. Minneapolis: Univ. of Minnesota Press, 1936.

11a. Paterson, D. G., and Tinker, M. A. Time-limit and work-limit methods. Amer. J. Psychol., 1930, 42, 101-104.

12. Purdue Pegboard (Examiner Manual). Science Research Associates, 228 South Wabash, Chicago, Illinois.

13. Stead, W. H., and Shartle, C. L. Occupational counseling techniques. New York: American Book Co., 1940.

14. Steel, M., Balinsky, B., and Lang, H. A study on the use of a work sample. J. appl. Psychol., 1945, 29, 14-21.

15. Thorndike, R. L. Personnel selection. New York: John Wiley, 1949.

16. Tiffin, J., and Greenly, R. J. Employee selection tests for electrical fixture assemblers and radio assemblers. J. appl. Psychol., 1939, 23, 240-263.


The Journal of Applied Psychology
Vol. 37, 1953


A Comparison of the Revised Allport-Vernon Scale of Values 
(1951) and the Kuder Preference Record (Personal) 


Ira Iscoe and Omer Lucier 


University of Texas 


The revised Allport-Vernon Scale and the Kuder Preference Record (Personal) yield separate “trait” scores which are named and defined. The purpose of the research herein reported was to examine the communality of the separate trait scores on the two scales. According to the definitions given in the respective manuals (1, 3), it would be expected that a high positive correlation would exist between (1) the Theoretical “trait” scores of both instruments; (2) the Economic scores of the Allport and the Practical of the Kuder; and (3) the Political of the Allport and the Sociable and the Dominant of the Kuder.

It would be also expected that a high negative correlation would exist between the Aesthetic of the Allport and the Theoretical of the Kuder.


Subjects 


Ninety adult males, the majority of them
University of Texas students, acted as subjects.
The mean age of the group was 26
years with an S.D. of 6.5 years. The average 
number of years of education was 14.6 with 
an S.D. of 2.4. The mean scores and S.D. 
made by the experimental groups were not 
significantly different from the scores made 
by comparable groups used by Allport and 
Kuder in standardizing their tests. 


Procedure 


The tests were administered in accordance 
with the instructions in the respective manu- 
als. The Allport-Vernon was taken first, fol- 
lowed by the Kuder. If the subjects did not 
have time to complete the Kuder during the 
scheduled sessions it was taken home and re- 
turned later. Scattergrams were made for the 
numerous combinations of item scores for one 
inventory with item scores of the other in- 
ventory. Since a rectilinear relationship was 
obtained for all scattergrams the use of the 
Pearson product-moment formula for correla- 
tion was justified. The evaluation of the data 


Table 1

Correlations Between Each Score of the Kuder Preference Record (Personal)
and Each Score of the Allport-Vernon Scale of Values, 1951
se e Kuder 
iable Practical Theoretical Agreeable Dominant 
Allport n On g6 p= 86 r= 85 y= 84 om es 
r 
04 «20 —08 i 
Coretica] —.30 : = i 

Economie can .00 -09 a 2 13 

“sthetic (29 .01 -23 , : —.08 

Socia] (90) 10 05 <17 10 —08 

litical (77) 02 33 = — 32 30 

Religio, ee 13 27 =e 13 —.16 

(.90 

a a in pa heses immediately followi 
* : anuals, and are placed in parent ely following 
the i he score reliabilities are from the Sea RET h the Kuder designation. at Tenuil contains relia- 
pity port designation and immediately Richardson Formula. aliport scores were Shel Meee - was 

at fo. Gtsures c ted by The Pucer- ual for the 4 1 -Retest 
ta ier 100 men.” The reliabilities m oe een the test and the retest. d fon rot th also includes a table 
OF 6k 34 cases wi nth intervening Or tical material on the revised form ot this inventory is as yet 
ases with one mol 0. Statistica to the authors, “ the present revision offers nee 


lit. J 5 E 
scarey Half reliabilities with an N of Fon (1951). A 


in © a F 
Apro, Ue to the recency of its P changing the = 
1951 version 


V 
Study “ments without in any W3 
°F Values’ as compared to the 


e 
E its scope of usefulness” (1, p. 6). 




by means of factor analysis was not resorted 
to in view of Guilford’s (2) recent article on 
“Ipsative” factors and his remarks that scales
such as the Kuder were not amenable to fac- 
tor analysis. A total of 30 correlations (six 
traits of the Allport and five for the Kuder) 
were computed. 
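
The correlational step described above can be sketched in a few lines. This is only an illustration of the Pearson product-moment formula; the function name `pearson_r` and the five pairs of scores are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical raw scores for five subjects (illustrative only).
allport_theoretical = [48, 41, 35, 52, 44]
kuder_theoretical = [25, 30, 22, 33, 28]
r = pearson_r(allport_theoretical, kuder_theoretical)

# Six Allport scales crossed with five Kuder scales give 30 coefficients.
n_correlations = 6 * 5
```

The formula assumes a rectilinear relationship, which is why the scattergrams were inspected before it was applied.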
Results 


It can be seen from Table 1 that none of 
the hypotheses put forth at the beginning of 
this article were justified. The correlation of 
.20 between the two theoretical scales is sur- 
prisingly low. Indeed, the highest positive 
correlation obtained (.47) was between the 
aesthetic of the Allport and the theoretical of 
the Kuder—where the expectancy was for a 
high negative correlation. The low positive 
correlation between the “Social” of the All- 
port and the “Sociable” of the Kuder can be 
explained in that other than having similar 
names, they are defined rather differently. 


Conclusions 


The results obtained point up once again 
the dangers of using similarly defined traits 
measured by different instruments. As an ex- 
ample, one of our subjects obtained the fol- 


lowing raw scores on the “Theoretical” of 
both instruments: 




Percentile 
Instruments Raw Score Rank 
Allport 48 73 
Kuder (Personal) 25 13 


It can be seen that on one instrument he
would be considered of reasonably high theoretical
orientation, while on the other he would
be very low. Since both the Allport and the
Kuder are used in educational and vocational
counseling, a totally different picture of this
subject’s interest would have been furnished.
Knowing the relationships between the various
measuring instruments is perhaps one way of
avoiding gross errors in the counseling
situation. One avenue of further research
might be the use of the two instruments on a
population where the traits mentioned were
believed to be present to a high degree.


Received July 17, 1952. 


References 


1. Allport, G. W., Vernon, P. E., and Lindzey, G.
Manual to the Study of Values. New York:
Houghton Mifflin Co., 1951.

2. Guilford, J. P. When not to factor analyze.
Psychol. Bull., 1952, 49, 26-27.

3. Kuder, G. F. Manual for the Kuder Preference
Record (Personal). Chicago: Science Research
Associates, 1949, Revised Sept. 1949.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


Administering Form BB of the Kuder Preference Record, Half 
Length 


A. A. Canfield ¹


Wayne University 


Administering Form BB of the Kuder Preference
Record ² to an entire group of people
in industry usually requires more than an
hour; yet, many finish in less time. Feelings
of fatigue, boredom, and annoyance are
frequently expressed by the examinees during
the course of the test. Sighs of relief are
regular expressions for employed adults who
complete the record. The laborious and frequently
painful task of punching the holes
with the small pin provided, the turning of
progressively smaller and smaller pages, and
the apparent duplication of items from page
to page combine to give an emotional reaction
that is, in the main, unpleasant. Examinees
wonder if some scoring technique is used to
“check-up” on the consistency of their answers
by comparing their answers on what
they believe to be the same item occurring on
different pages.

The objections raised by the examinee are
not always easy to turn away with good
conscience, for the specific percentile scores
obtained are normally sorted into quite broad
and often arbitrary categories for interpretation.
The manual ³ accompanying the test, in
the section devoted to the interpretation of
scores, recommends that three general score
categories be used in interpreting the results:
high for percentiles above 75, low for percentiles
below 25, and average for the middle
group. Some test users have expanded this to
include five groupings. It can be really
embarrassing to try to explain to an examinee


iQ 2 
fitm he author wishes to express his Mi- 


eor; ciates m 
ge Fry and Asso Pie for 


plies for 
esley 


² Kuder, G. F. Preference Record. Chicago: Science
Research Associates.

³ Kuder, G. F. Revised Manual for the Kuder
Preference Record. Chicago: Science Research
Associates, 1946.


the need for the length of the test, if one ad- 
mits to this type of interpretation, an inter- 
pretation which many users of the test make. 
In addition to these broad classifications, the 
manual contains sample profiles for 51 occu- 
pations. The manual cites the desirability of 
collecting more data for the purpose of pre- 
paring occupational profiles of greater relia- 
bility, but does not recommend their use in 
counseling or guidance work. 

In many cases a measure of interests, such
as this test provides, would be a useful adjunct
to the information supplied by other
tests, but the testing time required makes it
impractical. Miles ⁴ has recognized the problem
and suggested using the scores obtained
on pages 7, 8, and 9 of the record and then
multiplying them by constants to predict the
total score on each of the nine interest areas.
Using this method on a sample of 205 adult
males, correlations were obtained between the
predicted and the actual scores ranging from
.76 on Part III (Scientific) to .91 on Parts
II and VI (Computational and Literary).

The present study was undertaken because
it was considered desirable to elaborate this
ratio approach by making a correlational
analysis and developing regression equations
for a more accurate prediction of the total
score. An examination of the answer sheet
showed that the division of pages in the booklet
that comes the closest to giving an even
division of item responses for the nine interest
areas was that of the odd-numbered pages
versus the even-numbered pages. Since this
division also supplied an odd-even reliability
grouping, it was decided to undertake the
study using this page division.

⁴ Miles, R. W. A proposed short form of the
Kuder Preference Record. J. appl. Psychol., 1948,
32, 282-285.


Table 1

Correlations and Regression Equations Obtained in the First Sample
N = 301

                         Odd Pages             Odd-Even            Even Pages
Interest Area         r    Regression Eq.    r    Corrected r    r    Regression Eq.
1. Mechanical        .96   1.38x + 20.18    .92      .96        .89   1.91x + 9.10
2. Computational     .92   1.85x +  3.30    .77      .87        .93   1.68x + [illegible]
3. Scientific        .92   1.82x +  3.95    .78      .88        .93   1.68x + [illegible]
4. Persuasive        .94   1.81x +  7.60    .84      .91        .94   1.83x + 5.93
5. Artistic          .92   1.65x +  1.02    .78      .88        .95   1.99x + 7.31
6. Literary          .92   2.01x +  8.28    .81      .90        .92   1.60x + [illegible]
7. Musical           .88   1.83x +  2.20    .78      .88        .91   [illegible]
8. Social Service    .93   1.80x +  5.36    .76      .86        .92   1.67x + [illegible]
9. Clerical          .97   2.10x +  6.29    .69      .82        .91   1.38x + [illegible]


Method

A total of 301 completed records, representing
a substantial proportion of the persons
employed in supervisory, staff and advisory,
and skilled line positions by a large
midwestern canning company, were pulled
from the files. Each paper was then scored
by interest area, the score on each area being
divided into that achieved on the odd-numbered
pages and that obtained on the even-numbered
pages of the booklet. Inasmuch
as these scores, as well as the totals in some
cases, were noticeably skewed, all of the distributions
were normalized by the percentage
method. Correlations were then computed between
each of these scores and the other two
for each interest area. The computation of
the mean and standard deviation of each distribution
supplied the additional data necessary
to develop the regression equations desired
for predicting the total scores from the
scores on either of these two halves.
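
The regression step described in the Method section can be illustrated as follows. This is a minimal sketch, assuming the usual least-squares line built from a correlation, two means and two standard deviations; the function names and the summary statistics passed to them are hypothetical. Only the final line uses a figure from the article, the Mechanical odd-page equation printed in Table 1 (1.38x + 20.18).

```python
from math import sqrt


def regression_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def standard_error_of_estimate(r, sd_y):
    """Standard deviation of the errors made when y is predicted from x."""
    return sd_y * sqrt(1.0 - r ** 2)


def predict_total(part_score, slope, intercept):
    """Predicted total score from a half-length (odd- or even-page) score."""
    return slope * part_score + intercept


# Hypothetical summary statistics (not taken from the study).
slope, intercept = regression_from_summary(
    r=0.90, mean_x=40.0, sd_x=10.0, mean_y=80.0, sd_y=20.0
)
see = standard_error_of_estimate(0.90, 20.0)

# Mechanical, odd pages, as printed in Table 1: total = 1.38x + 20.18.
predicted_mechanical = predict_total(40, 1.38, 20.18)
```

With the hypothetical inputs the line is roughly 1.8x + 8.0, which has the same general form as the equations reported in Table 1.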

To check upon the accuracy of these equations
the completed records of a second sample
of 100 employed males, drawn alphabetically
from the files, were scored in the
same manner. The score in each interest area
was broken down into that obtained on the
odd-numbered pages and that achieved on
the even-numbered pages. None of the papers
used in the first sample were included in this
second group. Correlations were then computed
between the predicted scores and the
obtained scores for each of the nine interest
areas, and the standard errors of estimate
obtained. As a check upon the representativeness
of the two samples used in this research,
the means and standard deviations of
the total scores in each of the interest areas
were also computed.


Results 


The original correlations between the scores
on the odd-numbered pages and the total
scores, and the resulting regression equations,
are presented in the first two columns of
Table 1. The correlations between the scores
on the odd-numbered and even-numbered
pages for each of the nine interest areas are
shown in the third column. Inasmuch as they
represent odd-even reliability figures, the
corrected values, using the Spearman-Brown
prophecy formula, are shown in the adjacent
column. These values are almost identical
with those given in the manual for reliabilities
computed using the Kuder-Richardson
method. The two right hand columns of Table
1 show the correlations between the scores on
the even pages and the total scores, and the
resulting regression equations. These regression
equations were then used for predicting
the total scores as previously described.⁵
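
The Spearman-Brown correction mentioned above can be checked with a short sketch. The function name `spearman_brown` and the `length_factor` parameter are illustrative choices; the three odd-even correlations are the ones that remain clearly legible in Table 1.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Reliability of a test length_factor times as long as the part that gave r_half."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


# Odd-even correlations from Table 1 (Mechanical, Persuasive, Clerical).
odd_even = {"Mechanical": 0.92, "Persuasive": 0.84, "Clerical": 0.69}
corrected = {area: round(spearman_brown(r), 2) for area, r in odd_even.items()}
```

Rounded to two places, these give .96, .91 and .82, the corrected values shown in the adjacent column of the table.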

The correlations obtained between the predicted
total scores, based on the odd-page
scores, and the obtained total scores for the
second sample, along with the mean errors of
prediction and the standard errors of estimate,
are shown in the first three columns of Table
2. The same information for the predictions
based on even-page scores is given in the last
three columns of Table 2.

It will be noted that the correlations between
the predicted total scores and the obtained
total scores range from .90 to .97.
The standard errors of measurement are small
considering the percentile equivalents, and the
mean errors similarly small.


Table 2

Correlations Between Predicted and Obtained Scores, Mean Errors, and Standard Errors
of Estimate Obtained in the Second Sample
N = 100

                            Odd Pages                      Even Pages
Interest Area         r     M error     S.E. est     r     M error     S.E. est
1. Mechanical        .97      1.32        4.94      .93       .21        7.54
2. Computational     .94      -.95        4.06      .93       .10        4.17
3. Scientific        .92       .49        5.51      .94   [illegible]    5.08
4. Persuasive        .97   [illegible]    5.39      .96      -.40        5.87
5. Artistic          .95      -.85        4.27      .94       .44        4.63
6. Literary          .92       .95        4.96      .96      -.65        3.73
7. Musical           .90       .16        3.16      .93      -.22        2.70
8. Social Service    .91       .16        6.18      .92      -.32        5.79
9. Clerical          .92   [illegible]    5.39      .96      -.03        3.99
Table 3 shows the means and standard
deviations of the two groups used in this
study and the means and standard deviations
of the norm group reported in the manual.
The means and standard deviations of the
three groups are generally similar, with the
exception of the generally higher interests of
the experimental groups in the persuasive
area.


Summary

This study was designed to determine the
plausibility of administering Form BB of the
Kuder Preference Record half length. An
analysis of the test answer sheet indicated
that the odd pages and the even pages of the
test contained a fairly even distribution of
items in each of the nine interest areas measured
by the test.

⁵ Conversion tables have been prepared which
make it possible to translate a part-score directly
into the total score for each of the nine interest
areas. A copy of these conversion tables can be
secured from the Department of Personnel Methods,
Wayne University, Detroit 1, Michigan, at cost.

Table 3 
Means and Standard Deviations of the Two Sample Groups and the Test Norm Group 
i Verification Study Norm & 
ly y rm Grow 
rg cot ie Sap 

Mterest Area M S.D. M oD. M S.D. 

‘ Mechanical 81.3 18.8 PR e 78.6 22.8 
3, go MPutational 36.8 11.4 35. u 35.3 10.6 
4 Cientific . 63.2 14.8 61.8 5 64.0 15.5 
* Petsuasiy, 6.6 20.9 92.4 21.0 74.4 20.6 
S Artistic o 40.6 14.0 46.1 13 
g Attistic 44.7 12.9 12.7 e 
7 viterary 198 143 50.0 A 47.8 15.1 
g. llsical 16.1 7.8 A jsd F 9.6 
s Socia] Servi 73.6 16.8 5 el ut 17.5 
* Mlericay ee 48.0 13.3 47.8 é 52.1 13.5 




The completed answer sheets of 301 em- 
ployed males, representing a variety of jobs, 
were analyzed and mathematical equations 
for predicting total scores from the scores on 

the two different sets of pages were developed.
A second sample of 100 employed males was 
used to test the accuracy of predictions using 
these equations. 

The results indicate that the test could be
administered half length with little loss of
accuracy under normal conditions of test interpretation.
This reduction in administration
time means an appreciable reduction in
testing costs, greatly lessened feelings of fatigue,
boredom, and irritation for the examinee,
greater possibilities for using the test in
industrial situations where its administration
time has previously been considered prohibitive,
and an opportunity to use the time
saved for the administration of other tests
that might contribute to the prediction of
job success.


Received June 30, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


Attitudes Toward Public Low-Rent Housing, Before and After 
Construction * 


Kenneth E. Clark 


University of Minnesota 


and 


Charles E. Swanson


Institute of Communications Research, University of Illinois 


Two years prior to the collection of the
data reported herein, there was announced
in a midwestern city a plan for the erection
of a public housing project, using federal
funds, to provide living facilities for persons
of low income. Since this project was to be
built in an area fairly well surrounded by existing
housing, it was considered that the survey
of attitudes of persons in the neighborhood
both before and after construction of
the development would provide significant
information on the dynamics of attitude
changes. The results of the original survey
before construction have already been reported
in this journal.²

The two surveys were made under fairly
comparable conditions. The first was made
in June 1950, shortly after announcement of
approval of the project. The second, made
just two years later, came after construction
had started, and about two weeks before the
first families began to move into the project.

¹ The writers are indebted to Mr. Norris Ellertson
for his work in charge of the interviewing in this
study, and for his work on the results. This study
was made possible by the support of the Office of
Naval Research (N6onr-246, T.O. IV).

² Clark, K. E., and Swanson, C. E. [title partly
illegible: "... to public low-rent housing."] J. appl.
Psychol., 1951, 35, 342-347.

In the first survey, a fixed-address sample of
196 units was listed; interviews with a responsible
adult were obtained in 188, or 96 per
cent. In the second survey, the same list of
addresses was used, supplemented by an additional
sample. A total list of 388 addresses
was obtained, of which 366 were usable (17
had been torn down, 4 addresses were erroneous,
1 house was vacant). Of these 366, 351
households, or 96 per cent, were contacted
and interviewed. Six householders refused
to be interviewed (less than 2 per cent); 2
would not answer the door; 2 were not at
home after 2 call-backs; 2 were out of town;
2 returns were not usable; 1 was unable to
help because of death in the family.
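
The sample accounting for the 1952 survey can be verified with simple arithmetic. The sketch below only restates the counts given in the text; the dictionary keys are paraphrased labels chosen for this illustration.

```python
# Addresses on the combined 1952 list and the reasons some were unusable.
listed_addresses = 388
unusable = {"torn down": 17, "erroneous address": 4, "vacant": 1}
usable = listed_addresses - sum(unusable.values())

# Usable households that did not yield an interview.
not_interviewed = {
    "refused": 6,
    "would not answer door": 2,
    "not at home after 2 call-backs": 2,
    "out of town": 2,
    "return not usable": 2,
    "death in family": 1,
}
interviewed = usable - sum(not_interviewed.values())

completion_per_cent = round(100 * interviewed / usable)
refusal_per_cent = 100 * not_interviewed["refused"] / usable
```

The result, 351 interviews from 366 usable addresses, matches the 1952 N used in the tables, and the refusal rate comes out below 2 per cent as stated.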

The same questionnaire used prior to construction
was used after construction, with
only minor changes (“how many stories will
...” was changed to “how many stories does
...”). A new question was added to permit
sorting the householders into those present
in the community at the time of the first survey
and those not present.


Results 


Opposition to the housing project decreased
somewhat over the two year period. Responses
to the questions “Do you favor or
oppose the construction of this new development?”
and “How strongly do you feel about
this?” for the two years, 1950 and 1952, are
presented in Table 1. This increase in favor
occurs without a reduction in the number of
no opinion responses. This is a rather surprising
result, since one might expect that,
with the project in actual physical existence,
more persons would have formulated an opinion
of some sort. The physical appearance
of these new units in general attracted favorable
comment, which may account in part for
the shift in attitude, but certainly not for
the continued high percentage of persons refusing
to state a position.


Table 1

Opinions Toward Low Rent Housing Project in 1950 and in 1952

                                                         No Opinion
                              Favor        Oppose       or Qualified
                           1950  1952    1950  1952     1950  1952
Total Group:  Number         73   159      71   108       44    84
              Per Cent       39    45      38    31       23    24
By Intensity of Feeling:
  Very strongly             42%   46%     49%   53%       0%    1%
  Rather strongly           33    35      39    33        0     4
  Not strongly at all       24    17      12    13       39    21
  No answer                  1     2       0     1       61    74
  Total                    100%  100%    100%  100%     100%  100%


The same question asked about income in
1950 was asked again in 1952, except that an
additional $500 was added to each response
category as a rough estimate of the average
increment in income which might have been
expected during this period. These two distributions
are presented in Table 2.

That our estimate of the average increment
was not too far off is indicated by the similarity
of the two distributions.

Table 3 shows that in 1950 a slightly larger
proportion of persons with high incomes
tended to favor the project than did those
with lower incomes. This was also true in
1952. The 1952 opinions are generally more
favorable for all three groups; the lowest income
group, however, has a much larger percentage
of undecided respondents in 1952
than it had in 1950. Why this should be is

Table 2

Reported Incomes of Respondents in 1950 and in 1952

       Income                      N             Per Cent
   1950          1952          1950  1952      1950  1952
$5,000 up     $5,500 up          62   108        33    31
4,000-4,999   4,500-5,499        38    54        20    16
3,000-3,999   3,500-4,499        36    71        19    20
2,000-2,999   2,500-3,499        30    53        16    15
1,000-1,999   1,500-2,499        13    21         7     6
0-999         0-1,499             2    15         1     4
No answer                         7    29         4     8
Total                           188   351       100%  100%

Table 3 
Opinion on Housing According to Income Level in 1950 and in 1952 aa 
E alified of 
Income Level N Favor Oppose No Opinion 
1950 1952 1950 1952 1950 1952 1950 1952 1950 195 
$5,000 and up $5,500 and up 62 108 4 19% 
$3,000 to 4,999 $3,500 to 5,499 a mso a^ S% B% U% LA 199 
Less than $3,000 Less than $3,500 45 89 - a 


35 40 8 2 ā 3 


E, 




Table 4

Do You Think Property Values Will Go Up, Down, or Stay the Same?

                               N          Go Up       Go Down     Stay Same      Other
                           1950 1952    1950 1952    1950 1952    1950 1952    1950 1952
Total Group                 188  351     5%   4%     35%  28%     50%  60%     10%   8%
Favor Project                73  159     8    5       7   11      77   77       8    7
Oppose Project               71  108     3    3      75   60      16   30       7    8
No Opinion or Qualified
  Opinion on Project         44   84     2    2      18   19      61   68      19   11


not clear. It would seem that these persons
became more uncertain about the project as
its reality increased.

Tables 4, 5, and 6 present responses to three
questions requiring prediction about the effects
of the project, divided according to responses
favoring or opposing the project. These results
are interesting since they indicate a reduction
in prediction of an unpleasant sort by
those who oppose the project. Thus, those
persons who still say they oppose the project
do so with less associated feelings of unpleasant
consequences of the project.

A series of information questions was included
in the original survey to determine the
degree to which persons had become acquainted
with specific portions of the plan
for the development. Responses in 1950 and
1952 are compared in Table 7. Item numbers
refer to the following questions:

1. “About how many families do you understand
will be housed in this development?”
Correct answer in 1950 was,
“120”; in 1952, “184.”

2. “How much a month will be charged for


Table 5

Do You Think This Unit Will Bring Undesirable People into Neighborhood?
N Yes No Other 
D 
1952 1950 1952 1950 1952 1950 1952 
1950 52 5 š 9 
Total Gr S gs 351 4% 28% a 15% 21% a% 
Cvor Project "i 159 15 : ; 4 e 2 z 
PPose Project 1 108 76 5 
Pinion or Qualified á r ž j is i 
Pinion on Project 44 8. 
Table 6

Will Construction of Development Have Effect on Your Long Term Plans to Stay
or Move Out of Neighborhood?
= Yes No Other 
N zi 
ee 1950 1952 1950 1952 1950 1952 
1950 195 s an Th a iT 
peal Gr jes 351 28% ia 86 96 3 3 
OLYF Proje 73 «159 wt 4 3o 4 oOo æ 
Ject 7 55 
N Se Proj 71 108 
© Opinion 6 84 19 5 15 
Opini 2 OF Qualified u a 11 


on on Project 


Table 7 
Correct Responses to Information Questions in 1950 and 1952 
N Question 1 Question 2 Question 3 Question 4 Question 5 

1950 1952 1950 1952 1950 1952 1950 1952 1950 1952 1950 1952 
Total Group 188 351 30% 23% 12% 23% 15% 79% 27% 34% 24% a 
Favor Project 73 159 32 2 R 2 25 84 30 32 A S 
Oppose Project 71 108 32 29 13 26 13 84 3t 4 24 
No Opinion or Qualified x A 

Opinion on Project 44 84 23 19 9 14 2 64 16 29 27 3 


rent, heat and utilities for one of these 
units?” Correct answer in both years 
was “about $36 per month.” 

3. “How many stories do the units have?” 
Correct answer in both years was, “two.” 

4. “What is the most money a family can 
make a year and still rent a place in the 
development?” Correct answer in both 
years was, “not to exceed $2400 plus 
$100 per dependent.” 

5. “If an undesirable family gets into this 
development will the housing authority 
be able to get them out?” Correct 
answer in both years was, “yes.” 


The percentages reported in Table 7 are 
the percentages of correct answers. Only 
question 1 shows a decrease in correct in- 
formation, perhaps due to the change in cor- 
rect answer from 120 families in 1950 to 184 
families in 1952. The only item on which 
opponents of the project show less informa- 
tion than proponents is question 5. Question 
3 shows a remarkable improvement in in- 
formation, although it is rather surprising 
that even though this large and prominent 
project exists in their immediate neighbor- 
hood, 21% of the residents do not know that 
the buildings are two stories in height! 


Results from Matched Respondents 
The preceding analysis is of interest in describing
the total change in sentiment in
the community, but yields little information
about what happens to the individual respondent
as a result of watching this project
develop from the planning to the construction
stage. Accordingly, from the sample of
addresses used in both the 1950 and 1952
surveys, respondents were matched, as nearly
as possible, using information on apparent
age, sex, and reported education. A total of
171 of the original 188 households were included
in the 1952 survey. Of these, 14 were
newcomers; i.e., they reported that they were
not living in their present house in June of
1950. Of the remaining 157, 66 persons
were found with matching characteristics,
and are assumed to have been interviewed in
1950 and 1952. Their responses for these
two surveys are shown in Table 8.

These results are not in accord with those
for Table 1, where the percentage of qualified
and no opinion responses remained as
high in 1952 as in 1950, and where, apparently,
gains in favor were made at the expense
of the “oppose” group. These results,
for matched respondents, indicate that gains
in favor are made at the expense of the qualified
or no opinion groups.

Further information on this point is obtained
by comparing the responses of matched

Table 8

Comparison of Responses of Same Persons
in 1950 and 1952

                                   1952
                              Qualified
                                or No
1950               Oppose      Opinion      Favor      Total
Favor                 1           1           21         23
Qualified or
  No Opinion          5           9            8         22
Oppose               15           3            3         21
Total                21          13           32         66




households, rather than matched individuals. 
These results are shown in Table 9. (Note 
that the following values include the persons 
already reported in Table 8.) 

The results in Table 9 have somewhat more 
meaning than the preceding ones, since they 
are based on larger N’s, but have lost some- 
what in their significance since they represent 
matched households rather than matched in- 
dividuals. If we may overlook the latter fac- 
tor, it seems clear that the most significant 
changes in response have occurred in the 
original opposition group. This group shows 
almost as much shift in opinion as the origi- 
nal no-opinion group, and shifts almost as 
much to favor as it does to no-opinion. 


Table 9

Comparison of Responses of Same Households
in 1950 and 1952

                                   1952
                              Qualified
                                or No
1950               Oppose      Opinion      Favor      Total
Favor                 8           9           38         55
Qualified or
  No Opinion         12          16           15         43
Oppose               32          15           12         59
Total                52          40           65        157
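
The marginal totals and the amount of shifting in Table 9 can be recomputed from the nine cell counts. The sketch below is illustrative only; the variable names are hypothetical, and a "changer" is defined here as a household whose 1952 response falls in a different category from its 1950 response.

```python
# Column order for the 1952 responses, as in Table 9.
CATEGORIES = ["Oppose", "Qualified or No Opinion", "Favor"]

# Rows are 1950 responses; values are counts by 1952 response.
table_9 = {
    "Favor": [8, 9, 38],
    "Qualified or No Opinion": [12, 16, 15],
    "Oppose": [32, 15, 12],
}

row_totals = {group: sum(counts) for group, counts in table_9.items()}
column_totals = [sum(counts[i] for counts in table_9.values()) for i in range(3)]

# Households that moved out of their 1950 category.
changers = {
    group: row_totals[group] - counts[CATEGORIES.index(group)]
    for group, counts in table_9.items()
}
per_cent_changed = {
    group: round(100 * changers[group] / row_totals[group]) for group in table_9
}
```

On these counts 27 of the 59 households that originally opposed the project shifted (15 to no opinion, 12 to favor), against 27 of the 43 that originally gave a qualified or no-opinion response.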


The total number of households shifting
from one position to another, shown in Table
9, is larger than one might have predicted,
especially given the apparent direction of
shift. Is it possible that some of the original
responses were held with little intensity, and
so were almost the same as no-opinion responses?
Some evidence on this point is available
in the responses of the 66-person sample
in Table 8 from the question on intensity with
which the opinion was held. One might expect
the distribution of responses for the “changers”
to be somewhat different from that for the
“non-changers.” But such is not the case.
Although the N in this group is small (only 8
persons with an original response of favor or
oppose in the 66-person
sample changed their responses in 1952), the
distributions on intensity are still so nearly
identical as to suggest that this is not a likely
explanation. What seems more likely is that
the large number of “changers” occurs only
when we interview different persons in the
same household. This suggests “division of
opinion” within households as well as between
households.

The findings of a study of this sort have 
considerable significance for those persons 
associated with planning civic programs, for, 
in working on plans, one must not only con- 
sider the opinions of one’s public at the time 
plans for change are announced, but also the 
eventual degree of acceptance or non-accept- 
ance of the completed project. In this par- 
ticular instance the effect of the actual con- 
struction of a low-rent housing project was 
to reduce, but only to a slight degree, the 
opposition of the neighboring residents to the 
project. 

Summary 


Shortly after the approval in 1950 of plans 
for a low-rent housing project in a metro- 
politan area, interviews with neighboring 
householders were conducted to determine 
their opinions about this project. Again in 
1952, shortly after completion of the con- 
struction of the project, but before the new 
occupants began to move in, interviews were 
conducted with the original sample of house- 
holds, and with an additional sample of about 
the same size. It was found that: 

1. About equal numbers favored and opposed
the project in 1950, with about one in
four undecided. In 1952, about the same 
proportion continued to be undecided, but the 
proportion favoring the project had increased 
slightly (from 39 per cent to 45 per cent). 
The large number of respondents who con- 
tinue to be “undecided” in 1952 occurs in 
spite of the prominence of the project, the 
considerable publicity given to it, and the 
controversy about it. 

2. Comparison of the responses of persons 
and households who appeared in both the 
1950 and 1952 samples does not indicate very 
clearly the source of the increased response 




in favor of the project, but does suggest that 
when changes occurred, they were more likely 
to be from oppose to undecided, or undecided 
to favor, rather than from oppose to favor. 
3. What changes did occur must be con- 
sidered to be the result of the appearance of 
the project rather than the characteristics of 
the new residents, since interviewing was com- 
pleted before the units were occupied. 




4. Of incidental interest is that, by means 
of several call-backs plus the use of well 
trained interviewers, the number of refusals 
to be interviewed was kept below two per 
cent in both 1950 and 1952, and the number 
of interviews completed in the two fixed- 
address samples was maintained at about 96 
per cent. 


Received August 11, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953


Group Performance in a Manual Dexterity Task 


Andrew L. Comrey 


The University of California at Los Angeles


The research to be reported in this article 
was designed to provide some information on 
the following questions: (1) how well might 
the performance of a pair of individuals on 
a manual dexterity task be predicted from a 
knowledge of the individual manual dexterity 
scores of the two persons making up the
group; and (2) does the relative level of
group performance seem to be more closely
associated with the lower or the higher of the
individual performers?


Experimental Procedure 


Each “group” studied in this research consisted of two men. One hundred and thirty volunteers were recruited about equally from undergraduate and graduate students to make up 65 groups. No attempt was made to control the placement of individuals into groups. Some pairs were composed of friends, but in the majority of cases the individuals were either unknown to each other or were only casually acquainted.


The subjects in each pair were brought into a well lighted and [illegible] experimental room and seated across from each other at a table of approximately office-desk proportions. The experimenter was seated at the far end of the table. In front of each subject was a Purdue Pegboard test, the two boards touching each other at the end containing the peg cups. The standard instructions for the Purdue Pegboard, Assembly Task, were read to the subjects. Following this, six standard trials were taken, the scores being recorded after each trial for each subject individually. Each subject was able to see how well the other person was doing in comparison with his own performance.

Following the completion of the individual [illegible], one of the pegboards was removed and the other was placed between the two subjects, the long direction of the board perpendicular to the axis through the subjects, and the [illegible] away from the experimenter. The following instructions were read by the experimenter:

“In the second part of the experiment you will work on the same task, [illegible] you will work together rather than individually. First (subject A—on E’s left) will pick
up a peg and place it in the first hole of the 
row nearest you. Then (subject B—on E’s 
right) will pick up a washer and place it over 
the pin. Then (subject A) will pick up a 
collar and place it over the washer. Then 
(subject B) will pick up a washer and place it 
over the collar, completing the first assembly. 
At the same time, (subject B) will pick up a 
peg with the other hand and place it in the 
second hole of the row nearest him. Then 
(subject A) will place on a washer, (subject 
B) will put on the collar, and finally (subject 
A) will place on the final washer, picking up a 
peg at the same time with the other hand and 
placing it in the hole diagonally across from 
the assembly being completed. Thus, the as- 
semblies zigzag down the board, each person’s 
assignment alternating on each successive as- 
sembly. Now do a few assemblies for prac- 
tice.” 


When it was clear that the subjects understood 
the nature of the group task, six trials of one 
minute each were taken, scores being recorded 
for each trial. Scoring was the same as for the 
individual assembly task. This completed the ex- 
perimental session, usually taking 25 to 30 min- 
utes. In reading the instructions for the group 
task, the subject’s name was inserted in the ap- 
propriate space, italicized in the text given above. 


Treatment of the Data 


The statistical analysis of the data proceeded in the following steps:

(1) For each person, individual assembly scores on trials three and five were added together and scores on trials four and six were added together. These “split-half” scores were used to obtain reliability estimates and also were added to give a total individual performance score. The same procedure was followed for the “group” scores.
(2) The members of each pair were designated as “high” or “low,” respectively, on the basis of their total individual performance scores computed in (1) above. The person of the pair with the higher total was automatically classified as “high” and his partner was classified as “low.” Many individuals in the “low” classification had higher scores than some persons in the “high” classification. This apparently arbitrary method of dividing subjects was adopted because one objective of the experiment was to determine whether the lower or the higher of the two individual performers of a pair would have a greater influence on their group effort. Common sense might suggest that the pair could do no better than the poorer man, in a normative sense.

Table 1

Summary of Results

                                 Corrected r with
Score     M      σ      r_tt     High    Low     Group    Beta Weight
High     186    16.5    .90      1.00    .52     .56        .35
Low      173    16.8    .92       .52   1.00     .59        .41
Group    178    19.2    .87       .56    .59    1.00
R = .66     R² = .44

(3) Pearson product-moment correlations were 
computed between the “split-half” scores in the 
“high” and “low” categories and also for the 
“group” performances. Thus, the 65 persons in 
the “low” category each had two half scores. 
These scores were correlated and the correlation 
corrected for doubled length by the Spearman- 
Brown prophecy formula to obtain a reliability 
estimate for total individual performance scores 
in the “low” category. The same procedure was 
followed for those persons in the “high” classifi- 
cation and for the pairs of individuals involved 
in the “group” performance. 

(4) Pearson correlations were computed be- 
tween “high” and “low” individual performances, 
between “high” individual and “group” perform- 
ances, and between “low” individual and “group” 
performances. These correlation coefficients were 
corrected for attenuation in both variables in- 
volved, using the estimates of reliability obtained 
in (3) above. The correlations so obtained were 
treated as estimates of the values which might be 
expected between the given variables had the 
measures involved been entirely free of errors of 
measurement. 

(5) A coefficient of multiple correlation be- 
tween “group” performance and the respective 
“high” and “low” individual performances was 
computed, using the coefficients of correlation 
corrected for attenuation, as obtained in (4) 
above. The multiple correlation was computed 
using correlations corrected for attenuation be- 
cause it was desired to have an estimate of the 
maximum amount of variance which could be 
predicted under ideal conditions, i.e., with errors 
of measurement absent. The idea was to gain 
some indication of how much variance might be 
attributable to certain additional unknown vari- 
ables. 

(6) Beta weights for the “high” and “low”
scores, respectively, were computed for the
regression equation to predict “group” performance
scores.
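Steps (3) through (6) rest on standard formulas, and a short sketch may make the chain of computations easier to follow. The Python below is an illustration only and not the author's own procedure: the function names are hypothetical, the formulas are the usual textbook forms of the Spearman-Brown correction, the correction for attenuation, and the two-predictor standardized regression solution, and the inputs are the corrected correlations reported in Table 1.

```python
import math

def spearman_brown(r_half):
    """Step (3): reliability of the full-length score from the half-score correlation."""
    return 2 * r_half / (1 + r_half)

def half_from_full(r_full):
    """Inverse of Spearman-Brown: the half-score correlation implied by a full reliability."""
    return r_full / (2 - r_full)

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Step (4): estimated correlation between error-free measures."""
    return r_xy / math.sqrt(r_xx * r_yy)

def two_predictor_regression(r_y1, r_y2, r_12):
    """Steps (5) and (6): beta weights and R squared for two standardized predictors."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_squared

# Corrected correlations from Table 1 (high-group, low-group, high-low).
beta_high, beta_low, r_squared = two_predictor_regression(0.56, 0.59, 0.52)
multiple_r = math.sqrt(r_squared)

# Half-score correlation implied by the tabled "high" reliability of .90,
# and the uncorrected high-group correlation implied by the tabled values.
implied_half_r = half_from_full(0.90)
implied_raw_high_group = 0.56 * math.sqrt(0.90 * 0.87)

print(round(beta_high, 2), round(beta_low, 2), round(r_squared, 2), round(multiple_r, 2))
```

Run as written, the sketch gives beta weights of about .35 and .41, an R² of about .44, and an R of about .66, in agreement with the tabled values.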


Results 


The results of the statistical analysis have been summarized in Table 1. Inspection of the scatter plots revealed no indication of curvilinear regressions, although the plot between “high” and “low” scores had a restriction¹ due to the fact that the “low” score of a pair could not be greater than the “high” score. In the first column of Table 1 are listed the total score categories, “high,” “low,” and “group,” standing, respectively, for those total performances as described in the previous section. The means and standard deviations of the three sets of scores are given in the second and third columns, respectively. These are based on the totals of the last four of six trials. This procedure was decided upon in advance to obtain more stabilized results. In the fourth column are given the reliability estimates as obtained by the procedure described in (3) of the last section. The next three columns of Table 1 give the intercorrelations of the total score variables, corrected for attenuation. The steps were described in (4) of the section on treatment of the data. The last column contains the beta weights for predicting “group” performance from “high” and “low” individual performances. The multiple correlation, R, and R², as described in (5) of the last section, are given in the bottom row of the table.
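The restriction on the “high” and “low” scores produces a positive correlation by itself, which the article's footnote on random number pairs reports as .56 for one sample. The simulation below is a hedged sketch of that check, not the author's procedure: the number of pairs, the seed, and the function names are assumptions. For two independent uniform draws the correlation between the larger and the smaller value is very close to .50 in the long run, so a value such as .56 is what a modest sample could plausibly give.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain product-moment correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def high_low_correlation(n_pairs, seed=0):
    """Draw pairs of two-digit numbers; correlate the higher with the lower."""
    rng = random.Random(seed)
    highs, lows = [], []
    for _ in range(n_pairs):
        a = rng.randint(10, 99)
        b = rng.randint(10, 99)
        highs.append(max(a, b))
        lows.append(min(a, b))
    return pearson_r(highs, lows)

# A large run approximates the long-run value induced by the sorting alone.
r_large = high_low_correlation(20000)
print(round(r_large, 2))
```

The corrected “high-low” correlation of .52 in Table 1 is therefore about what the sorting of each pair would produce even if partners were unrelated in ability.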


Discussion 


Information pertaining to the first question in this research is given by the multiple correlation coefficient between “high” and “low” scores and the “group” score. The square of that coefficient indicates that 44 per cent of the variance in an errorless measure of group performance could be predicted from a linear combination of perfectly reliable “high” and “low” individual scores. The percentage which can be predicted by fallible measures would be less.

¹ To determine the effect of this artificial restriction, [illegible] pairs of two-digit numbers were taken from a table of random numbers, placing arbitrarily the higher of the two numbers in the first group and the lower of the two in the second group. The resulting Pearson correlation was .56.

Three possible explanations of this result will be given. First, the task itself may actually be significantly different in its nature from the individual assembly task. It was necessary for the subjects to alternate operations on succeeding assemblies when working together, which was not the case in individual operations. In future work, a redesigned individual task will be used which requires the subject to interchange the sequence of hand movements on alternate assemblies. This should make the operations in the individual task more like those in the group task.

Even though the previously mentioned difference in the individual and “group” tasks were eliminated, this would by no means indicate that the “group” task would then be the same to the participating persons as their individual tasks. The two tasks are probably different to each individual not only because of an intrinsic difference in the sequence or character of the operations but also because the group situation brings in new elements requiring the utilization of different abilities. The person must anticipate the moves of his partner to achieve a smooth performance. In short, it is suggested that there may be a group of abilities possessed to different degrees by different individuals which determine in part how well they will perform in group situations. These new abilities may be independent of those which determine the performance of the same operations in the same sequence by the individual as a single performer.
A third possible explanation of these results lies in the hypothesis of interactions among individuals. It may be that some or all subjects will work more effectively with some individuals than with others. Under this hypothesis, variation in group performance may be substantially influenced by the extent to which persons are [illegible] will work best together [illegible] abilities.


The second objective of the experiment was 
to determine if the lower individual per- 
former of a pair influenced the group per- 
formance more than the higher individual 
performer. This question may be answered 
by an examination of the correlations of the 
“low” and “high” scores with the “group” 
scores. The correlations, corrected for attenuation,
were .59 and .56, respectively.
The difference in effect on group performance 
by the “high” and “low” pair members is so 
slight as to be of no practical consequence. 
Thus, group performance here seems to be a 
function of the average of the two individual 
scores. Under these conditions, for a given 
group of workers, there would seem to be 
little to gain by trying to pair off the high 
ones and the low ones, expecting thereby to 
get more over-all production from the group 
as a whole. This conclusion naturally pre- 
sumes a similar type of prediction situation 
and the lack of further information beyond 
that which was available here. 

The same statistical treatment was also given to the data from the first two trials. Reliabilities were somewhat lower, .76, .82, and .74 for the “high,” “low,” and “group” scores, respectively. Intercorrelations among these scores, corrected for attenuation as before, were “low-high,” .53, “low-group,” .67, and “high-group,” .58. R² was .51. Thus, during the practice trials, the low men influenced the group performance slightly more than they did in later trials. Also, during these trials, the group scores were more highly related to the individual performance scores than during the test trials, as shown by the higher R². The stabilized performance results probably have the greater practical value, however.


Summary 


Sixty-five pairs of volunteer male university students were given six trials on the Purdue Pegboard, Assembly Task, and six trials on the Assembly Task with the two members of each pair working together on the same assemblies rather than individually on separate [illegible]. The members of each pair were divided on the basis of the total of the last four individual trials, Assembly Task, into “high”
and “low” categories. Reliabilities were de- 
termined for “high,” “low,” and “group” per- 
formances, using alternate trials and correct- 
ing for doubled length. Correlations of the 
“high” and “low” performances with the 
“group” performance and with each other 
were computed and corrected for attenuation. 
The multiple correlation and regression 
weights were obtained for predicting “group” 
performance from “high” and “low” individ- 
ual performances. 

The results showed that less than half the 
group performance variance could be pre- 
dicted from a knowledge of the individual 
performances, even with the effect of er- 
rors removed. It is suggested that manifest 
differences between the “individual” and 
“group” tasks, interactions among individ- 
uals, and a constellation of abilities in the 
general area of cooperation may account for 
the variance not predicted by perfectly re- 
liable individual performance scores. 




The level of group performance was only 
slightly more dependent on the “low” individ- 
ual performances. For all practical purposes 
equal weights could be used for “high” and 
“low” scores in predicting “group” perform- 
ance. 

Two practical implications of the results of this experiment are as follows. First, in industrial situations where two or more individuals must cooperate on a given task, it must not be assumed that individual performances on a similar type of task will account for most of the variation in group performance. Secondly, for a given group of persons there seems to be little point in taking the trouble to pair them on individual ability in a related type of individual task since group performance seems to be dependent on the approximate average of their individual scores rather than the high or low individual performance.


Received August 4, 1952. 


The Journal of Applied Psychology
Vol. 37, No. 3, 1953


Response Time as an Indicator of Color Deficiency * 


Sherman Ross and 1st Lt. John L. Fletcher, MSC, USA 


University of Maryland


In 1907 Froeberg (2) reported that reaction time varied inversely with the intensity of the stimulus. Since that time reaction time measures have been put to a variety of uses. Steinman (6) reported that simple reaction time to stimulus change was an adequate method for studying sensitivity to stimuli. She also found that reaction time decreased as the magnitude of the change increased. In a study whose purpose was to determine the speed and accuracy of discriminations of hue, brilliance, area, and shape of visual stimuli, J. B. Reed (4) found that as the difference between two areas or hues is increased, discrimination time is decreased.

When pseudo-isochromatic plates are used, the subject reacts to a stimulus complex. This suggests that if the discrimination required is difficult, the response time would be longer than if the discrimination were easy. Further, if color contrast is absent on the test plate the subject would hesitate and seek other cues, such as differential brightness, as a basis for responding.

Reports from several investigators lend support to this notion. J. D. Reed (5) studied reactions to a complex submarine signal panel board, and reported that use of reaction time measures revealed the increased difficulty of discriminations for color defectives. Studies by Pickford (3) and Sultzman (7) refer to hesitancy on the part of the color defective individuals.

On the basis of these [illegible] observations of hesitancy on the part of color defective individuals, the following hypothesis was tested: Color “defective” individuals will have longer mean plate response times than individuals classified as “normal.”

* The writers wish to express their sincere [illegible] to [illegible] Farnsworth, [illegible] Head, Visual Engineering [illegible], Medical Research Laboratory, New London, Conn., for his review of the manuscript. The opinions or assertions expressed in this paper are those of the writers and are not necessarily [illegible] of the military departments.

Method and Procedure 


Test. The color test used in this experi- 
ment consisted of a set of 15 pseudo-isochro- 
matic plates (14 diagnostic and 1 demonstra- 
tion) selected from the American Optical 
Company test * by Farnsworth and called the 
“Proposed Armed Forces Color Vision Test 
for Screening” (1). 

Subjects. A total of 136 students (108 
male, 28 female) from the University of 
Maryland were used in the experiment. 

Method. Test plates were presented singly 
to the subjects in the order prescribed by 
Farnsworth (1). The subjects were tested 
twice in succession. Half of the subjects were 
given a criterion trial first, while the remaining
half were given a test trial first. The
criterion trial was conducted exactly as recom- 
mended by Farnsworth (1) except that the 
subjects were instructed to respond to the 
plate as soon as possible. If no response was 
made in 3 sec., the plate was removed. The 
test trial differed from the criterion trial only 
in one respect, i.e., response times were taken 
for each plate in the test trial. 

Apparatus. Illumination was provided by 
a Macbeth Daylight lamp (No. ADE 10) as 
suggested by Farnsworth (1). The test plates 
appeared against a flat black background and 
were placed on a bracket which slid rapidly 
into a viewing aperture. Presentation of the 
plate started an electric chronoscope cali- 
brated in .01 sec. The subject’s verbal re- 
sponse activated a voice key and stopped the 
timer. Verbal reports and response times as 
described above were recorded. 


Results and Discussion 


The error scores made by the subjects on the criterion trial were used to classify the subjects as normal or defective. The norms recommended by Farnsworth (1) were used in the classification; i.e., four errors or less for normals, five errors or more for defectives.

* Pseudo-Isochromatic Plates for Testing Color Perception (Revised Selection, American Optical Company).

[Figure 1, plot not reproduced. Legend: Normal; Normal - Defective; Defective. Horizontal axis: mean response time (.01 sec.), 30 to 240.]

Fig. 1. Mean plate response time frequency distribution for normal and color defective subjects (N = 136).

Using this criterion, 110 subjects (84 male 
and 26 female) were classified as normal. 
The mean plate response time for the normal 
group was .66 sec., ranging from .31 sec. to 
1.42 sec. The defective group had 26 members
(24 male and 2 female). Mean plate
response time for the defective group was 1.93 
sec., ranging from 1.24 to 2.40 sec. Fig. 1 
shows the mean plate response time distribu- 
tion for all subjects, normal and defective. It 
should be noted that subjects classified as 
normal by the criterion test are also cate- 
gorized as normal by mean plate response 
times, and that the deficient subjects, with the 
exception of two cases, are classified as de- 
fective by both error score and response time 
measures. 

The t-test was used to test the significance
of the difference between the means of the
distributions of mean plate response times for
the normal and defective subjects. The difference
was found to be significant at less
than the .001 level of confidence.

Table 1 presents a plate-by-plate analysis of the errors made by the normal and defective subjects. The first analysis of the error scores consisted of a t-test to determine whether or not a significant practice effect existed. We were unable to reject the null hypothesis for the defective group, but were able to reject the null hypothesis at less than the .01 level of confidence for the normal group. This could mean that there is no appreciable change in error scores for defective subjects. On the other hand, normal subjects made significantly fewer errors on their second test trial.

Table 1

Number of Errors per Plate by Normal and Defective
Subjects for the First and Second Trials

              Normal (N = 110)        Defective (N = 26)
Plate No.     Trial 1    Trial 2      Trial 1    Trial 2
    2            9          3           20         22
    3                                   25         25
    4                                   16         17
    5                                   20         23
    6           56         38           25         24
    7                                   20         17
    8                                   16         16
    9                                   18         [illegible]
   10            2          2           16         16
   11           14         14           24         24
   12                                   19         20
   13                                   13         14
   14            1          1           20         18
   15           43         23           25         25

It may be noted from Table 1 that normal subjects in varying numbers missed plates 2, 6, 10, 11, 14, and 15. Specifically, 51 per cent of the normal subjects missed plate 6, and 39 per cent missed plate 15 on the first trial. Errors were made on every plate except the demonstration plate by the defective individuals. Plates 2, 3, 5, 6, 7, 11, 14, and 15, however, were missed by a larger number of these subjects than the other plates.
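The percentages quoted for plates 6 and 15 can be checked against the first-trial error counts for the 110 normal subjects in Table 1. The few lines of Python below are only a worked check; the variable names are illustrative and not part of the original report.

```python
N_NORMAL = 110  # normal subjects, from the classification reported in the text

# First-trial and second-trial error counts for normal subjects (Table 1).
first_trial_errors = {6: 56, 15: 43}
second_trial_errors = {6: 38, 15: 23}

def percent_missing(errors_by_plate, n_subjects):
    """Whole-number percentage of subjects who missed each plate."""
    return {plate: round(100 * errors / n_subjects)
            for plate, errors in errors_by_plate.items()}

first_pct = percent_missing(first_trial_errors, N_NORMAL)
second_pct = percent_missing(second_trial_errors, N_NORMAL)
print(first_pct, second_pct)
```

The first-trial figures round to 51 and 39 per cent, matching the text, and the second-trial counts show the drop in errors that underlies the practice effect reported for the normal group.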

It was stated earlier that response time measures might prove useful as an indicator of color deficiency. An examination of Fig. 1 reveals that the mean plate response time frequency distributions of the normal and defective subjects overlap in only one interval, 1.20-1.35 sec. With the exception of the two cases represented, all subjects with a mean plate response time of 1.50 sec. or over (or an over-all response time of 21.0 sec.) could be classified as deficient, and those with response times less than 1.50 sec. as normal.
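The two classification rules described in this article, the error-score criterion and the proposed response-time cut-off, can be set side by side in a small sketch. The function and constant names are hypothetical. Reading the 21.0-sec. over-all figure as 14 diagnostic plates times 1.50 sec. is an inference from the numbers in the text, not something the authors state.

```python
DIAGNOSTIC_PLATES = 14         # 14 diagnostic plates plus 1 demonstration plate
MEAN_TIME_CUTOFF_SEC = 1.50    # proposed cut-off on mean plate response time
MAX_ERRORS_NORMAL = 4          # norm quoted in the text: four errors or less

def classify_by_errors(n_errors):
    """Criterion-trial rule: four errors or less is normal."""
    return "normal" if n_errors <= MAX_ERRORS_NORMAL else "defective"

def classify_by_response_time(plate_times_sec):
    """Proposed rule: a mean plate time of 1.50 sec. or over is defective."""
    mean_time = sum(plate_times_sec) / len(plate_times_sec)
    return "defective" if mean_time >= MEAN_TIME_CUTOFF_SEC else "normal"

# Over-all time equivalent of the cut-off, assuming all 14 diagnostic plates count.
overall_cutoff_sec = DIAGNOSTIC_PLATES * MEAN_TIME_CUTOFF_SEC

# Hypothetical subjects responding at the two reported group means.
typical_normal = [0.66] * DIAGNOSTIC_PLATES
typical_defective = [1.93] * DIAGNOSTIC_PLATES
print(overall_cutoff_sec,
      classify_by_response_time(typical_normal),
      classify_by_response_time(typical_defective))
```

A subject responding at the normal group mean falls well below the cut-off and one at the defective group mean falls well above it; the two exceptional cases mentioned in the text lie in the overlapping interval of the distributions.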

It may well be possible that the limit of 3 sec. employed put an artificial limit on the response times. A color-blind individual would not necessarily, at the end of this period, make the correct response or any response. On the other hand, a normal or near-normal subject, who usually eventually makes correct responses, might well be categorized by a response time method. For further investigation of the practical usefulness of this method, the 3-sec. ceiling should be extended.

The problem in classification is that of the near-normal subjects. The number of borderline subjects in the present experiment was limited, and these borderline subjects constitute the difficult classification problem. Most tests will roughly separate the extremes of the population. The problem is to devise tests to select the Class II group (mildly defective color vision) as classified by Farnsworth (1).

Further research is indicated along this general line since it is entirely possible that response time measures could serve in classification. It is possible that memorization of the plates could be detected by this method. Response time measures could readily be used in military and industrial situations.


Summary 


Response time measures to 15 selected plates of the AO pseudo-isochromatic test for color perception were secured for 136 students (28 females, 108 males). Of this group, 110 were classed as normal, 26 as defective, on the basis of their error scores.
Each subject was given two successive tests: 
criterion and test trials. Mean plate response 
times between the normal (0.66 sec.) and de- 
fective (1.93 sec.) groups were found to differ 
significantly. Practice effects were noted 
within the normal group, but were not found 
in the defective group. It was concluded that 
response time measures could be used in the 
separation of color normal from color defec- 
tive individuals. 


Received August 14, 1952. 


References 


1. Farnsworth, D. Proposed armed forces color vision
test for screening. U.S.N. Submarine
Base, Medical Research Laboratory, Report 
No. 180, 1951, 10, 146-155. (Color Vision 
Report No. 24.) 

2. Froeberg, S. The relation between the magnitude 
of the stimulus and the time of reaction. 
Arch. Phil., Psychol., Sci. Meth., 1907, No. 8, 
38. 

3. Pickford, R. Individual differences in colour vi- 
sion. London: Routledge and Kegan Paul, 
1950. Pp. 386. 

4. Reed, J. B. The speed and accuracy of discrimi- 
nation differences in hue, brilliance, area, and 
shape. U.S.N. Special Devices Center, Port 
Washington, L. I., N. Y. Tech. Rept.
SDC-131-1-2, Sept. 1951.

5. Reed, J. D. A note on reaction time as a test of 
color discrimination. J. exp. Psychol., 1949, 
39, 118-121. 

6. Steinman, A. Reaction time to change. Arch. 
Psychol., 1944, 41, No. 292. Pp. 60. 

7. Sultzman, J. H., Lieut. (MC) USNR. Comparison
and evaluation of the American Optical
Co. Pseudo-Isochromatic Plates, First and
Second Editions. U.S.N. Submarine Base,
Medical Research Department, BuMed-X-480
(Au-255-p), 16 July 1945.


The Journal of Applied Psychology
Vol. 37, No. 3, 1953 


The Effect of Set on Performance in a “Trouble Shooting” 
Situation 


Nicholas A. Fattu and E. Victor Mech¹


Institute of Educational Research, Indiana University 


With equipment-systems periodically gain- 
ing in complexity, both industry and the 
Armed Forces are faced with the task of 
training personnel in the maintenance of these 
equipment-systems. However, the problem
of how to systematically train maintenance 
men has thus far received meager experi- 
mental treatment. 

It has frequently been postulated that the 
trouble shooting process of locating defects in 
equipment-systems is similar to that encoun- 
tered in problem solving situations. When a 
mechanic is confronted with an equipment- 
system that is “malfunctioning” and no ap- 
propriate response is available, the process of 
locating the defective part clearly possesses 
the properties of a problem situation. Unfor- 
tunately, the gap between the problem solving 
literature and understanding complex trouble 
shooting processes is not easily bridged. Al- 
though several Promising concepts are em- 
bodied in the problem solving literature ap- 
parently there is no simple transplanting of 
these notions to problems encountered in the 
trouble shooting of complex equipment-sys- 
tems. 

The purpose of the present experiment was 
to test a concept found important in previous 
problem solving studies by applying it to a 
more realistic problem situation. 

The studies of Maier (2, 3) suggest that a 
knowledge of the required parts of a solution
does not necessarily mean the occurrence of
a solution. Evidence is presented that indi- 
cates additional information in the form of a 
set is necessary; necessary in the sense that 
the additional set clears the way and increases 
the probability of a correct solution. 

The hypothesis tested in this experiment was that the ability to “trouble shoot” or locate defects in an equipment-system entails more than being trained in the basic components, or essential parts of the equipment.

It was postulated that in addition to teaching basic components something more was required in the form of a set of principles dealing with systematic location of a “malfunctioning” component.

¹ The authors are indebted to Mr. Walter Ciszezon for material contributions to apparatus, and to Mr. Jasper Smaliks for aiding in the derivation of the symptom analysis “trouble shooting” method.


Method 


Apparatus. The apparatus in Figure 1 is called a gear-train, consisting simply of a set of gears and shafts mounted on a piece of aluminum [fraction illegible] inch thick, 29 inches in length, and 20 inches in width. The gear-trains were arranged to form two series and four parallel channels that provided for crossed information chains.

Two operating controls, A and B, provided the input necessary to obtain the desired motion. The motion was transferred through the gear-trains and as an end result closed a switch that caused a series of red lights to illuminate the control panel.

The red lights would illuminate only if the equipment was functioning properly and control A was turned 13 times and control B, 12 times. When the appropriate number of turns was made and the expected end result (control panel lighting up) did not occur, this indicated to S that there was a “malfunction” in the gear-train.

Malfunctions. For the purposes of the experiment only one class of “malfunction” was utilized. It was a defect of the “slipping gear” type produced by loosening a set screw and found by the authors in a previous study (1) to be [illegible] difficult. Each S received six malfunctions of this type on the pre-test and six malfunctions of the same type on the post-test, making 12 malfunctions that each S was required to locate. Ss were presented the malfunctions in a random order, and, in addition, each of the malfunctions was inserted at a location determined from a table of random numbers.

Procedure. Ss were run individually, and immediately upon entering the laboratory for the first time E gave S the Standard Operating Procedure (hereafter referred to as S. O. P.) for the apparatus. The S. O. P. consisted of turning control handle A, 13 turns and control handle B, 12 turns, and if the red lights on the control panel did not light up, it indicated to S that something was wrong with the gear-train and the task was to “trouble shoot” the equipment. After the S. O. P. orientation, S was given six malfunctions or problems to locate. Each problem was given singly, S being placed in an adjoining room while the malfunction was being inserted by E. A time limit of 15 minutes was allowed for each malfunction.

When each S had completed working on the initial six “malfunctions,” the following procedure was followed. Ss in Group 1 were taken to an adjoining room for a 20-minute period during which they were allowed to read a current Life Magazine. Ss in Group 2 received a tape-recorded “basic knowledge” lecture. Integrated with the lecture were slides projected on a screen with a 35 mm. camera. The rationale of the lecture was to convey the basic nomenclature and function of the gear-train apparatus. Included were such concepts as transfer of motion, and the function of gears, bearings, and shafts. Group 3 received the basic knowledge information given to Group 2, and, in addition, was given a tape-recorded lecture on how to “trouble shoot” the gear-train. This “trouble shooting” lecture was based on a simplified version of a previously developed symptom analysis guide to “trouble shooting.” The lecture stressed starting with the greatest magnitude of error, locating the first correctly operating component nearest the greatest error, and once these two points were bracketed, locating the defect somewhere between.²
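The bracketing idea stressed in the lecture can be pictured with a small sketch. Everything in it is an assumption made for illustration: the article does not give the lecture's actual steps, so the data model (one serial chain of check points, ordered from the control end to the output end, each with an observed error), the function name, and the tolerance parameter are all hypothetical.

```python
def bracket_defect(errors, tolerance=0.0):
    """Return (last_good_index, worst_index) bracketing the defect, or None.

    `errors` holds the observed error at each check point of one serial
    chain, ordered from the control (input) end to the output end.
    """
    magnitudes = [abs(e) for e in errors]
    if not magnitudes or max(magnitudes) <= tolerance:
        return None  # nothing is out of tolerance
    # Start with the greatest magnitude of error.
    worst = magnitudes.index(max(magnitudes))
    # Walk back toward the input to the nearest correctly operating point.
    last_good = -1  # -1 means no correct point was found upstream
    for i in range(worst - 1, -1, -1):
        if magnitudes[i] <= tolerance:
            last_good = i
            break
    return last_good, worst

# Hypothetical readings: the first three check points are correct,
# then error appears and grows toward the output.
readings = [0, 0, 0, 3, 3, 5]
bracket = bracket_defect(readings)
suspects = list(range(bracket[0] + 1, bracket[1] + 1))
print(bracket, suspects)
```

With these made-up readings the defect is bracketed between check points 2 and 5, so only points 3 through 5 need to be inspected rather than the whole chain.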


² The symptom analysis “trouble shooting” lecture has been filed with the American Documentation Institute. Order Document 3968 from ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington, D. C., remitting $1.25 for microfilm (images 1 inch high on standard 35 mm. motion picture film) or $1.25 for photoprints.

Fig. 1. The gear-train apparatus.


Subjects. The Ss were 54 college students enrolled in the School of Education at Indiana University. Three groups of 18 Ss each were used in the experiment. Assignment of the 54 Ss to each of the three groups was done from a table of random numbers. Participation in the experiment was required in order to eliminate the bias often found by asking for volunteers.


Results


The raw data for this experiment are the post-test gains or the number of malfunctions correctly located minus the number located on the pre-test, and the total time required to reach a decision as to the location of the malfunctions. Arriving at a decision, however, does not necessarily mean that it was a correct one.

Figure 2 illustrates graphically the post-test gains made by the three groups of subjects in “trouble shooting” the gear-train apparatus. Statistical significance of post-test gains was tested by an analysis of variance of the thirteen possible scores ranging from 6 to -6.


Fig. 2. Post-test gains in malfunctions correctly
located over the initial measures.


Prior to computing the analysis of variance,
Bartlett’s χ² test was utilized to test the
homogeneity of variance of defects located.
Since the χ² value of 2.26 did not reach the
5% level (df = 2), the hypothesis of no dif-
ferences among group variances could not be
rejected.

The analysis of variance performed on post- 
test gains is summarized in Table 1. The ob- 
tained F value of 9.36 for 2 and 51 degrees 
of freedom was significant at the 1% level of 
confidence. 

It appears, then, that the significant gains
demonstrated by the performance of Group 3
subjects can defensibly be attributed to the
effects of the “trouble shooting” lecture that
the two remaining groups did not receive.
However, a glance at Figure 3, showing the


Table 1

Analysis of Variance of Differences Between Initial
and Test Location of “Malfunctions”

Source of Variance    df    Sum of Squares    Mean Squares    F
Between Groups         2         27.88            13.94       9.36*
Within Groups         51         76.01             1.49
Total                 53        103.89

* Significant beyond the 1% level of confidence.
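The arithmetic behind Table 1 can be retraced from the tabled sums of squares and degrees of freedom. The sketch below is a check on the table, not the authors' computing procedure; the function name `f_ratio` is an assumption made for the example.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way layout."""
    ms_between = ss_between / df_between  # variance estimate from group means
    ms_within = ss_within / df_within     # variance estimate from within groups
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom as given in Table 1.
ms_b, ms_w, f = f_ratio(27.88, 2, 76.01, 51)
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F = {f:.2f}")
```

Dividing the unrounded mean squares gives an F within about 0.01 of the tabled 9.36; the tabled figure is what results when the mean squares are rounded to two decimals before the division.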




Table 2

Covariance Analysis for Time to Decide the Location
of Pre- and Post-Test Malfunctions

Source of Variance    Sum of Squares    df    Mean Squares    F
Total                     618.37        52
Within Groups             531.79        50       10.64
Adjusted Means             86.58         2       43.29        4.07*

* Significant at the 5% level of confidence.


time in minutes for the groups to reach a de-
cision with respect to where various malfunc-
tions were located, points out an interesting
reversal. Although Group 3 was superior in
locating defects under test conditions, it is
clear they failed to decrease subsequent “trou-
ble shooting” time, while the time required by
the remaining two groups was reduced. In
order to test for differences with regard to
time taken to reach a decision as to the loca-
tion of defects, an analysis of covariance,
shown in Table 2, was carried out between the
pre-test and post-test time measures.
The F of 4.07 was significant at the 5%
level indicating that the means of the groups
on the post-test time measures cannot be ac-
counted for by differences in mean level of
initial ability as measured in the pre-test
trials.


Fig. 3. Comparison between the pre- and post-
test of total time required to decide the location of
malfunctions. (Ordinate: location time in minutes,
pre-test and post-test, for Groups 1, 2, and 3.)


Accordingly, the results of this study are 
that the additional “trouble shooting” lec- 
ture acted to produce a differential effect on 
subsequent performance in locating gear-train 
defects. The group which received both the
basic knowledge and “trouble shooting” sets
did not appreciably reduce the time scores in
comparison with the remaining groups. 

Time is a rather dubious criterion of per- 
formance in the trouble shooting situation. 
A comparison of Figures 2 and 3 (gains and 
time) suggests that the longer time required 
by Group 3 may be attributed to deliberation 
required for an accurate judgment, while in 
Group 1 the small time required might be at- 
tributed to snap judgment. 

In the final analysis, the findings suggest
that besides learning about various com-
ponents of an equipment-system, systematic
training in “trouble shooting” methodology is
required in order to obtain efficient results.

Intensive research with more complex equip-
ment is needed to determine the additional
skills and knowledges required to “trouble
shoot” successfully. Such a problem as de-
termining which trouble shooting procedure
is more generalizable than another, and test-
ing its transfer power on succeedingly com-
plex equipment is, indeed, a fascinating lab-
oratory challenge. By using the laboratory
method on these and related problems, much
useful information about problem solving in
general, and the significant variables of “trou-
ble shooting” in particular could be systema-
tically obtained.


Summary


The experiment reported tested the hypoth-
esis that ability to “trouble shoot” or locate
defects in a specified equipment-system re-
quires something more than being trained in
the parts or components of that equipment-
system. Fifty-four undergraduate students
enrolled in the School of Education at Indiana 
University participated in the experiment. 
Certain training factors were common to the 
three groups. All Ss received identical in- 
doctrination in the Standard Operating Pro- 
cedure for a gear-train apparatus, after which 
each S was given six problems or malfunctions 
to locate in the equipment. This procedure 
was used to obtain a pre-test measure of 
“trouble shooting” ability on the gear-train. 
After the initial measure, Group 1 received 
no further information, Group 2 received a 
tape-recorded basic knowledge lecture that 
explained the nomenclature and functioning 
of the gear-train, while Group 3 received the
basic knowledge lecture plus a symptom analy-
sis lecture designed to aid in “trouble shoot-
ing” the gear-train apparatus.

The post-test gains indicate that the addi- 
tional “trouble shooting” lecture acted to 
produce a significant gain in malfunctions cor- 
rectly located. Time required, however, de- 
creased for Groups 1 and 2, but remained con- 
stant for Group 3. It is suggested that time 
is a dubious criterion of “trouble shooting”
performance.


Received March 6, 1953. 
Early publication. 


References 


1. Fattu, N., and Mech, E. V. Interruption: Its ef-
fect upon performance in a trouble-shooting
situation. J. Psychol., in press. 

2. Maier, N. R. F. Reasoning in humans: I. On di- 
rection. J. comp. Psychol., 1930, 10, 115-143. 

3. Maier, N. R. F. Reasoning in humans: II. The 
solution of a problem and its appearance in 
consciousness. J. comp. Psychol., 1931, 12, 
181-194. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


An Evaluation of Two Experimental Charts as Navigational 
Aids to Jet Pilots * 


John E. Murray 


Dunlap and Associates, Inc., Stamford, Connecticut 


A modern trend in the progress of aviation 
is toward the provision of improved facilities 
for the pilots of high-speed, high-altitude air- 
craft. One of the basic tools required by such 
pilots is the aeronautical chart. A review of 
the literature and an examination of current 
charts show that existing charts fail, in some 
respects, to provide the pilots of high-speed, 
high-altitude aircraft with a highly effective 
navigational tool. Much of the material pre- 
sented on the charts is superfluous: some of 
the natural features cannot be seen from high 
altitudes, and much of the chart content can- 
not be absorbed in the time available for 
navigation at high speeds. 

An increase in the number of charts was 
not accompanied by a judicious selection of 
the chart content. As more aeronautical in- 
formation became available, it was added to 
the basic chart without consideration of the 
flight and navigational requirements of mod- 
ern planes and air operations. 

This procedure resulted in the production 
of all-purpose charts: charts for use in any 
aircraft on any type of flight. These charts 
are cluttered with information, difficult to 
read and inconvenient to use because of their 
size. To overcome some of these defects, 
special purpose charts were designed but with-
out the use of adequate criteria in the selec-
tion and presentation of information. More-
over, there seem at present to be no well
established methods by which aeronautical
charts can be evaluated. Even more striking 
is the fact that, in the history of chart pro- 
duction, very little systematic study has been 
made of the pilot’s task of interpreting the 
information presented on the charts. Con-
sequently, the major objective of this study
was to devise experimental techniques ap-
plicable to the evaluation of principles of
chart construction.


* This research was supported under the terms of
the contract between the Office of Naval Research,
and Dunlap and Associates, Inc., Contract Number
N8onr 641-05. This paper is a summary of Report
No. 641-05-6 under that contract.

From the point of view of the psychologist, 
this study is valuable in that it demonstrates 
the application of psychological methodology 
to the problem of chart evaluation. From the 
cartographer’s viewpoint, the experimental 
evaluation of charts yields two types of in- 
formation which can serve in the future course 
of chart development: (1) a test of the ap- 
plicability of general principles and tech- 
niques of chart construction; and (2) the rela- 
tive value of different methods of presenting 
information. 

In order for information to be used most
efficiently by the pilot, it should obviously be
displayed so as to provide maximum legibility.
This involves a determination of the best
method of presenting chart information and
requires a study of the contributions of color,
type of symbol, size and style of printed type,
and other related items to legibility.

There are two basic ways in which a chart
can be evaluated. One is subjective and de-
pends upon pilots’ opinions which can be
gathered from interviews and systematic ques-
tionnaires; the other is objective and requires
the collection of performance test data of
various sorts. Both methods have been em-
ployed in the present study. The design of
this study involves the following steps:

1. The selection of specific features to be
included and evaluated on experimental
charts.

2. The preparation of tests to measure the
readability of charts.

3. The preparation of a test to measure the
effectiveness of charts in representing what
the pilot sees from the air.

4. The construction of a questionnaire to
determine the pilot's attitude toward experi-
mental charts in terms of their content and
practicability for actual flight conditions.

The results of the first step are embodied
in the structure of two experimental charts.


These charts differ in the amount, kind and 
method of presenting information to the pilot. 
The significant differences between the two 
experimental charts are displayed in Table 
1. The precise objective of this study was to 
determine the relative effectiveness of the 
present World Aeronautical Chart (WAC)
and the two experimental charts, the XJN 
Chart produced by the Aeronautical Chart 
and Information Service and the XDA Chart 
designed for the Office of Naval Research. 
Representative samples of the three charts 
are presented in Figure 1. 


Evaluation Procedures 


Readability Tests. Tests were designed to 
determine the speed and accuracy with which 
pilots can find and use information contained
in the charts. Given the task of reporting
certain specified information, the speed and
accuracy of performing this task with each
chart can be taken as an index of the effec-
tiveness of their presentation. In effect, these
are tests of legibility or ease of reading. This 
legibility is a function of type size, symbols 
used, color and the density of the information 
shown on the chart. The relative superiority 
of the charts can be determined for those fea- 
tures in which they differ. The following 
tests of readability were constructed: 

Part I. Airport Information. Flight lines 
were drawn connecting fourteen airports. 
The subject was required to give the airport 
type, elevation, runway length and available 
electronic facilities for each of the airports 
specified. The maximum possible score on 
this test was 50. 

Part II. Radio Information. Similar flight 
lines were drawn connecting various radio aids 
on each chart and the subject was required to 
give the type, frequency and call letters for 
each radio aid specified. The maximum possi- 
ble score was 42. 


Table 1

Differences in the Presentation of Specific Features on the XJN and XDA Charts

Front of Chart¹

1. Color of land and water areas
   XJN: Yellow and blue
   XDA: Green and blue

2. Terrain features
   XJN: Hypsometric tints; contour lines; spot elevations
   XDA: Shadient tints to approximate three-dimensional view; highest peak only

3. Cities
   XJN: Predominant in yellow
   XDA: Subdued in gray

4. Transportation lines
   XJN: Roads and railroads differentiated
   XDA: Roads and railroads indicated by same symbol

5. Airports
   XJN: Jet and military airports shown by runway pattern; civil shown by circle; lighting and surface facilities indicated
   XDA: All airports shown have adequate lighting and hard surface runway and are represented by runway pattern; type of airport shown in data note; GCA and DF facilities indicated

6. State names and boundaries
   XJN: Shown by dotted lines
   XDA: Not shown on this chart

7. [feature name illegible]
   XJN: Both airports and radio information in magenta
   XDA: Airport information in blue; radio information in an improved magenta

8. Navigation light lines
   XJN: Shown by solid lines
   XDA: Not presented

9. Distance scale
   XJN: Along edge from 0 to 1,000 miles
   XDA: Starts from 0 at either end toward 500 in center; in bold type on both sides of chart

Back of Chart²

10. Radio beacon
    XJN: Symbol prominent
    XDA: Symbol subdued

11. Broadcasting station
    XJN: Symbol subdued
    XDA: Symbol prominent

12. Radio range
    XJN: Shows N quadrant
    XDA: Differentiates terminal and non-terminal ranges; shows inbound magnetic headings

¹ On the XJN chart, radio and airport information are presented in the same color and the symbols for each differ on the front and back of the chart; on the XDA, airport and radio information are differentiated in color but the symbols for each type of data are consistent on both the front and back of the chart.

² On the back of the chart, XJN shows Morse code, reporting points, fan markers, dumb-bell markers, airways; XDA does not present these features but includes the Atlantic coast line and a list of YG stations.


Fig. 1. Sample sections of the WAC, XJN, and XDA charts.


Part III. Natural and Cultural Features.
The subject was required to read and inter-
pret various features pertaining to terrain,
roads and railroads, cities, rivers, etc., used
for navigation in cross-country flight. The
maximum possible score was 15.

Part IV. Aerial Photographs. This test
was designed to measure the individual’s
ability to read an aerial photograph and to
determine its geographic location on the chart.
Seven photographs were selected from a series
taken at an altitude of approximately 40,000
feet on a flight from Dayton, Ohio to Wash-
ington, D. C. The subject was required to
locate the area pictured in each photograph
on the test chart provided.

Each experimental session was prefaced by
an introductory statement covering the pur-
pose of the study and the experimental pro-
cedure to be followed. Time limits of five
minutes each were imposed on Parts I and II.
Seven minutes each were allowed on Parts
III and IV. Each session required approxi-
mately 45 minutes of which 24 minutes were
for working time and the remainder for pre-
liminary instructions.

The tests were administered to groups of 
20 to 25 pilots at each session. A total of 
72 Navy pilots were tested on the XJN Chart, 
66 on the XDA Chart and 60 on the WAC 
chart for a total of 198 pilots. 

Item Analysis. As a more refined measure
of the effectiveness of the charts, each of the
four tests was subjected to an item analysis.
The number of individuals who marked each
item correctly was determined and a com-
parison among the three charts was made on
each item. In those instances where items
were incorrectly marked, the frequency with
which other alternatives were chosen was re-
corded.

Questionnaire. To elicit pilot preferences
for specific features on the experimental
charts, a questionnaire was distributed to
another group of Navy pilots. The question-
naire consisted of a series of questions con-
cerning the features which were differently
presented on the two charts. In each ques-
tion, the pilot was asked to state his prefer-
ence for one of the charts in regard to some
specific feature. Where applicable, reasons
for the choice or preference were also re-
quested. Free comments, whether favorable
or unfavorable, were encouraged as much as
possible. In all, 43 pilots were interviewed
with the questionnaire either on an individual
basis or in small groups of two to four men
each.


Results


Readability Tests. The mean score ob-
tained on each test for each chart is presented
in Table 2. To determine the effectiveness of
each of the charts, the test scores were com-
pared by the standard t-test techniques. The
test results indicate that airport, radio and
cultural information can be read more speedily
and accurately on the experimental charts
than on the traditional WAC Chart. The
XDA is significantly superior to the XJN
Chart in presenting airport information. This
superiority is probably due to the prominence
of the airport symbol and the simplicity of
the corresponding data note.

Only slight differences exist among the
charts when used to identify aerial photo-
graphs. These results imply that the reduc-
tion in the amount of detail on the experi-
mental charts does not hinder the pilot’s
identification of reference points.


Table 2

Mean Scores Obtained on Readability Tests
for the Charts Specified

Chart     No. in Group     Mean     Standard Deviation

Test I. Airport Information
WAC            60          37.1          7.31
XDA            66          46.3          6.46
XJN            72          43.0          7.41

Test II. Radio Information
WAC            60          34.5          8.76
XDA            65          39.9          5.28
XJN            71          39.0          5.51

Test III. Cultural Features
WAC            60           8.3          2.32
XDA            66          12.7          1.99
XJN            72          12.7          1.78

Test IV. Aerial Photographs
WAC            60           3.65         1.22
XDA            66           3.23         0.97
XJN            72           3.47         0.76
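The comparison of chart means by t-test can be retraced from the summary figures in Table 2. The sketch below assumes the pooled-variance form of the two-sample t statistic and treats the tabled standard deviations as computed with n − 1; the paper says only that "standard t-test techniques" were used, so both choices, like the function name `pooled_t`, are assumptions.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Test I (airport information): XDA against XJN, figures from Table 2.
t_airport, df_airport = pooled_t(46.3, 6.46, 66, 43.0, 7.41, 72)
print(f"XDA vs. XJN, airport information: t = {t_airport:.2f}, df = {df_airport}")
```

Under these assumptions the statistic comes out at roughly 2.8 on 136 degrees of freedom, which agrees with the statement that the XDA Chart is significantly superior in presenting airport information.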

Item Analysis. The data from the item 
analysis clearly show the relative value of 
each of the charts as a means of presenting 
information to the pilot. Economy of space 
does not permit the inclusion of the data ob- 
tained for each item. The important differ- 
ences among the charts can be summarized as 
follows: 

1. In the time limit allowed, fewer items 
were completed on the WAC Chart than on 
either the XDA or XJN Charts. This differ- 
ence seems to be due to the mass of informa- 
tion shown as well as to the unsystematized 
placement of the data notes on the WAC 
Chart. Furthermore, the size and scale of 
the WAC Chart make it awkward to manipu- 
late and difficult to locate the information 
required. 

2. The runway patterns on the XDA and 
XJN Charts were more effective than the 
traditional circular symbols on the WAC 
Chart. 




3. Data notes are more readily identified 
when placed closely to their related objects. 
Misidentification of certain airports on the 
XDA Chart, for example, resulted from im- 
proper placement of the data notes pertaining 
to these airports. 

4. Security areas are best shown on the 
XJN Chart. This seems to be due to the 
type size and face used in presenting these 
areas. 

5. In presenting terrain features, the XDA 
Chart is superior to both the XJN and WAC 
Charts. 

Questionnaire. The preferences of pilots 
for the specific features on the two experi- 
mental charts are as follows: 

1. Printed material on the XDA Chart is 
more easily read although the chart has a 
more cluttered appearance. 

2. The mileage scale on the XDA Chart is 
preferred. The scale should range from 0- 
500 miles from either end of the chart and it 
should be presented in the same manner on 
both sides of the chart. 

3. Runway patterns on the XDA Chart are 
preferred by 93 per cent of the pilots inter- 
viewed. It is considered desirable to present 
only those airports with adequate landing 
facilities for jet aircraft. 

4. The bold type for airport information 
on the XDA Chart is preferred and GCA and 
DF facilities are highly desirable. 

5. The radio broadcast symbol on the XJN 
Chart is preferred. It can be distinguished 
easily from the other radio symbols. 

6. On the back of the chart, the radio
beacon symbol appearing on the XJN Chart is
preferred; the radio broadcast symbol ap-
pearing on the XDA Chart is preferred.

7. Range stations are considered the most
important radio aids to navigation.

8. In presenting terrain features, pilots 
prefer the shadient tints of the XDA Chart 
but with the spot elevations of the XJN 
Chart. 

9. Pilots prefer the presentation of large 
cities in yellow as shown on the XJN Chart. 

10. The names of cities are more easily 
read on the XJN Chart. This seems to be 
due to the contrast between the black print 
and the yellow background of the land area. 




11. Pilots prefer the differentiation of roads 
and railroads as shown on the XJN Chart. 

12. Radio information is preferred on both 
sides of the chart but the symbols should be 
consistent on both sides. 

13. The inbound magnetic headings on the 
range legs of the XDA Chart and the indica- 
tion of the “N” quadrants on the XJN Chart 
were both highly favored.

14. Coastal outlines, cities, roads and rail- 
roads, and terminal ranges are desirable on 
the chart; non-terminal ranges are preferred 
in a less prominent form. 

15. Airways, fan markers, state names and 
boundaries are of minor importance to the 
pilot and need not be shown on the chart.

16. The size and scale of the two experi- 
mental charts are satisfactory but a new 
chart combining the best features of both is 
highly desirable. 


Summary 


The major objective of this study was to 
devise and apply experimental techniques 
through which data could be obtained and 
form the basis on which principles of chart 
construction could be evaluated. Some of 
these principles seem obvious but until experi- 
mental data were available, they remained in 
the realm of conjecture.

In evaluating the charts, data were ob-
tained from readability tests, an analysis of
test items and pilot preferences on a ques-
tionnaire. The results indicate that the two
experimental charts designed for navigation
in high-speed, high-altitude aircraft are su-
perior to traditional charts in presenting in-
formation for cross-country missions. On an
over-all basis, the experimental charts are not
statistically different from one another. How-
ever, there are several features on each chart
which appear to be highly effective in pre-
senting navigational information to the pilot.
It seems apparent, therefore, that the ideal
chart for navigation in high-speed, high-alti-
tude aircraft should include the desirable fea-
tures of each chart with further experimenta-
tion to determine the effectiveness of their
interaction.


Received July 7, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


The Relationship between Scotopic Visual Acuity and Acuity at
Photopic and Mesopic Brightness Levels *


J. E. Uhlaner, D. A. Gordon, I. A. Woods, and J. Zeidner


Personnel Research Section, Personnel Research and Procedure Branch,
Adjutant General's Office, Dept. of the Army, Washington, D. C.


The present problem is concerned with
visual acuity at various brightness levels, but
from a somewhat different point of view from
that taken in the usual psychophysical experi-
ment. Classical studies (4, 5) have described
visual acuity as a function of brightness level.
In such studies, interest is in mean or typical
performance. Individual differences are re-
garded as variability limiting the generality
of the findings. In the present problem, we
are interested in assessing the possibility of
constructing a practical test of night visual
acuity. In this regard, we are not concerned
with mean performance, but attention is
strongly centered on individual differences.

The Personnel Research Section of the
Adjutant General’s Office has been under-
taking, for some time, the development of a
practical, predictive test of night visual per-
formance. In early studies carried out at
Fort Sill (3) in 1944, and at Camp Blanding
(10), the Army Night Vision Tester (ANVT-
R2X) was constructed and validated. The
instrument is satisfactory from the point of
view of reliability and validity, but it has
shown itself too cumbersome for general field
service.

The present approach of the Personnel Re-
search Section to night vision testing attempts
to substitute an acuity test given at mesopic
(moonlight) brightness (6.75 log micromi-
crolamberts) for the scotopic (starlight) test.
This substitution is desirable because tests of
mesopic acuity involve less adaptation time
(and hence more rapid testing), less depend-
ence on light-tight testing conditions, and
fewer testing personnel. In the practical
mil[...] night vision could be [...].


* The opinions presented in the paper are those of
the authors and do not necessarily reflect the views
of the Department of the Army.


A mesopic acuity test may be substituted 
for a scotopic test, if the relationship between 
the two tests is shown to be high. Studies 
reported in the experimental literature indi- 
cate that visual acuity scores are correlated 
at certain brightness levels. The closer the 
brightness levels tested, the higher has been 
the correlation reported. 

The relationship of acuity at photopic (day-
light) and scotopic brightnesses has been in-
vestigated in two studies. Uhlaner and
Woods, 1951 (7), employing 200 subjects, re-
ported correlations ranging from .19 to .39
between various photopic acuity tests given
at 10.02 log µµL. and scores on the Army
Night Vision Tester given in the brightness
range of 3.51 to 5.26 log µµL. Warden, 1944
(8), however, found biserial correlations of
only .02 between scores on the Navy Radium
Plaque Adaptometer at 3.94 log µµL., and
scores on a Snellen test given at standard pho-
topic brightnesses. The restriction of range
on the photopic variable may partially ex-
plain the low correlation attained in this
study. The 100 subjects tested all had pho-
topic acuities of 20/20 or better.

Two other studies have been concerned with
the relationship of acuity scores taken at ad-
jacent brightness levels in the photopic-meso-
pic range. L. S. Rowland (6) compared acui-
ties at 10 log µµL., 7.6 log µµL. and 6.5 log
µµL. brightness levels, employing 56 subjects.
The tetrachoric correlations between acuities
at these levels (computed by the present au-
thors) are: 10 vs. 7.6 log µµL., r = .61; 10 vs.
6.5 log µµL., r = .73; 7.6 vs. 6.5 log µµL., r
= .61. Feinberg and Wirt (2) found the in-
tercorrelations of scores of far visual acuity,
measured on [number illegible] subjects on the
Bausch and Lomb Ortho-Rater checkerboard
target, at brightnesses ranging from “normal”
to 1/33 of “normal” to range from .71 to .90.
Generally, the closer the levels compared, the
higher the correlation attained.

The present problem extends these analyses 
to a comparison between scotopic visual acuity 
and acuity at photopic and mesopic brightness 
levels. From the viewpoint of assessing the 
practicality of developing mesopic tests to 
measure scotopic acuity, the present study is 
crucial. The feasibility of this approach
would be demonstrated if indications of suf- 
ficiently high correlations could be shown be- 
tween scotopic and mesopic acuity, and if 
these correlations were substantially higher 
than those between scotopic and photopic 
acuity. 

Method 


Apparatus. The scotopic measurements were 
made on the Army Night Vision Tester (ANVT- 
R2X, 7). This instrument presents a black, two- 
degree Landolt Ring against a four-degree white 
background. The intensity of illumination is 
varied through eight steps of decreasing bright- 
ness, by placing filters over the self-luminous 
radium plaque background. Brightness varies in
these steps between 5.26 and 3.51 log µµL. The
subject is required to indicate which one of eight
positions the break in the ring is facing. Eight
presentations of the stimulus are given in ran-
dom order at each brightness level.
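Brightness in this paper is given in log micromicrolamberts. For readers more used to millilamberts, the sketch below converts the levels mentioned in the text. It assumes the usual definitions, that one micromicrolambert is 10⁻¹² lambert and one lambert is 1,000 millilamberts; the function name `log_uul_to_millilamberts` and the labels attached to the levels are inventions for the example.

```python
def log_uul_to_millilamberts(log_uul):
    """Convert brightness in log10 micromicrolamberts to millilamberts."""
    lamberts = 10.0 ** (log_uul - 12)  # 1 micromicrolambert = 1e-12 lambert
    return lamberts * 1000.0           # 1 lambert = 1000 millilamberts


# Levels quoted in the text; the labels are descriptive only.
levels = [
    ("photopic", 10.0),
    ("mesopic", 6.75),
    ("ANVT brightest step", 5.26),
    ("ANVT dimmest step", 3.51),
]
for label, level in levels:
    print(f"{label:20s} {level:5.2f} log uuL = {log_uul_to_millilamberts(level):.3e} mL")
```

On this scale each unit step is a tenfold change in brightness, so the eight filter steps of the Army Night Vision Tester span a little under two log units.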

The photopic and mesopic acuity measurements 
were made on wall charts and on the Bausch and 
Lomb Ortho-Rater instrument. All tests were 
conducted in the Pentagon Vision laboratory. 
This laboratory was standardized in conformity 
with specifications prescribed by the Armed 
Forces-National Research Council Vision Com- 
mittee. The layout at this laboratory is shown 
in Figures 1 and 2.

The wall charts employed included the Modi- 
fied Landolt Ring, Army Snellen, Line Resolu- 
tion, and Quadrant Variable Contrast targets 
(Figure 3). Except for the Army Snellen, these 
charts were developed by the Personnel Research 
Section and were utilized in an earlier factor 
analysis study of photopic visual acuity. 

Photopic and mesopic acuity measurements
were also made by means of the Ortho-Rater in-
strument. The optical system of this instrument
presents the test target at an apparent distance
of eight meters (1). In the present study, only
the far visual acuity adjustment was employed.
Control of the voltage input of the Ortho-Rater
was maintained by means of a variac, a continu-
ously variable resistance. The voltage was regis-
tered on a voltmeter in parallel with the Ortho-
Rater and was periodically checked for deviations
from normal. The variac was also employed to
obtain the desired levels of photopic and mesopic
brightness at which testing took place. A slight
reddening of hue was found at the lower levels
of illumination; this may have affected the sub-
jects’ responses. It is assumed that the effect
on responses of this change in hue was negligible.


Fig. 1. Layout of the Pentagon Vision Laboratory,
floor plan. (Scale: 1/4 inch = 1 foot.)


Fig. 2. Layout of the Pentagon Vision Laboratory,
front view. Legend: C, test chart; HR, head rest;
B, light baffle; FP, fan pedestal; L, light; T, table;
S1, examinee's chair; S2, examiner's chair; CC,
chart changer.

The Johns Hopkins acuity plates developed by
Dr. Louise Sloan were employed as test targets
in the Ortho-Rater. These targets consist of 15
lines of block letters. The letters in each row
are of equal size, equal width of stroke, and
equal spaces between letters. The size of the
letters and the width of stroke decrease for suc-
ceeding rows from the top to the bottom of the
chart. There are five letters in the first row and
ten letters in the remaining rows. The same
letters, arranged in different order for each row,
are used. The letters range in size from 20/200
Snellen to 20/13 Snellen.

The brightness levels of the Ortho-Rater were
calibrated by use of the Macbeth Illuminometer
and the Taylor Low Brightness Illuminometer.
In using the Macbeth Illuminometer, the instru-
ment was sighted directly at a blank glass plate
set in the target position of the Ortho-Rater.
The variac was adjusted until the plate equaled
in brightness the pre-set level of the Illuminom-
eter. In making this adjustment, the required
brightness was corrected to compensate for loss
of light at the eyepiece of the Ortho-Rater. The
Taylor Low Brightness Illuminometer was sighted
at the blank glass plate through the eyepiece of
the Ortho-Rater. In all cases the variac settings
were independently checked by several observers.

Subjects. A total of 19 staff members of 
the Personnel Research Section were previously 
tested in December 1949 on the Army Night 
Vision Tester and were retested for this study in 
December 1950. Sixteen subjects from this group 
were used for one part of the analysis and 15 
were used for another part of the analysis. The 


Fig. 3. Wall chart tests used in the experiment: Modified
Landolt Ring, Army Snellen, Line Resolution, and Quadrant
Variable Contrast. (The background of the quadrant variable
contrast and line resolution tests was black.)


226 Uhlaner, Gordon, Woods, and Zeidner 


subjects were not identical for all the tests. Subjects
were selected to sample a wide range of
scotopic acuity.

Procedure. Testing on (a) wall charts, (b)
monocular Ortho-Rater plates, (c) first binocular
plates, and (d) second binocular plates occurred
in separate sessions. A month intervened between
(a) and (b), a week between (b) and (c),
and a month between (c) and (d). Scores on
the Army Night Vision Tester had been obtained
about a year prior to the commencement of the
present study. All testing was conducted with
corrected vision on those subjects who customarily
wore glasses. The procedures involved in
testing with the wall charts and the Ortho-Rater
plates will be described separately.

Wall Charts. Each subject was dark adapted 
for 10 minutes in the testing room which was 
darkened to approximately .001 foot-lamberts 
brightness. This length of dark adaptation is 
sufficient to allow valid visual acuity testing to 
be carried out at the lowest brightness level utilized
(Level 8). The tests were observed binocularly
in the following order: Modified Landolt
Ring, Army Snellen, Line Resolution, and Quadrant
Variable Contrast. Testing continued on
each subject until he had made three consecutive
errors. After the scores were recorded, the light
level was adjusted to the next higher brightness
level (Level 7); the subject was given an adaptation
period ranging from 15 to 30 seconds, and
testing again took place in the same order as in
Level 8. This procedure was followed for the
remaining six levels. The eight levels of illumination
employed are shown, in foot-lamberts and
log μμL, in Table 1.

Total time for each subject in each session was 
approximately 15 minutes. 
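
The two brightness units used for the illumination levels can be related by a short calculation. The sketch below assumes that the unit printed as "log μμL" is the base-10 logarithm of brightness in micromicrolamberts, and it uses the standard photometric equivalence of about 1.0764 millilamberts per foot-lambert; the function name is illustrative and not taken from the study. Under these assumptions, the .001 foot-lambert brightness of the darkened room corresponds to about 6.03 log μμL, the lowest level cited in the Summary.

```python
import math

# Standard photometric equivalences (not stated in the article itself):
# 1 foot-lambert is about 1.0764 millilamberts,
# and 1 millilambert is 1e9 micromicrolamberts.
MILLILAMBERTS_PER_FOOT_LAMBERT = 1.0764
MICROMICROLAMBERTS_PER_MILLILAMBERT = 1e9


def foot_lamberts_to_log_uul(foot_lamberts: float) -> float:
    """Convert a brightness in foot-lamberts to log10 micromicrolamberts."""
    if foot_lamberts <= 0:
        raise ValueError("brightness must be positive to take a logarithm")
    micromicrolamberts = (
        foot_lamberts
        * MILLILAMBERTS_PER_FOOT_LAMBERT
        * MICROMICROLAMBERTS_PER_MILLILAMBERT
    )
    return math.log10(micromicrolamberts)


if __name__ == "__main__":
    for ft_l in (0.020, 0.003, 0.001):
        print(f"{ft_l:.3f} ft-L -> {foot_lamberts_to_log_uul(ft_l):.2f} log uuL")
```

The Ortho-Rater settings were additionally corrected for loss of light at the eyepiece, so instrument values need not follow this conversion exactly.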

Ortho-Rater Tests. The procedure employed 
in administering the Ortho-Rater tests was simi- 
lar to that employed with the wall charts, except 
that only a single type of test target (letters) 
was administered at each brightness level. 

In the monocular testing, the subject was first 


tested with the right-eye target at the lowest 
level of illumination and then with the left-eye 
target at the same level of illumination. As in 
the wall-chart procedure, the level of illumination 
was raised to the next higher level and the sub- 
ject was given 15 to 30 seconds to adapt to the 
higher level. The subject was tested at this
level with the right-eye target and then with the
left-eye target. This procedure was repeated in
the same manner for the remaining six levels of
illumination.

Testing at light levels 1, 2, and 8 was omitted
from the first binocular test. A preliminary
analysis of the monocular data indicated that the
targets available did not adequately discriminate
between subjects at these levels.

All illumination levels were included in the
second binocular test because a new target was
employed. With the inclusion of this new target,
no inferences from the monocular data could
be drawn, as had been done for the first binocular
test.


Results 


Wall Chart. The relationship between
scores on the scotopic test and on each of
the wall chart tests at the brightness levels
tested is given in Table 2. The Quadrant
Variable Contrast test was discarded as it
failed to differentiate between subjects. The
items of this test were too difficult for the best
subjects. Scores on the Army Night Vision
Tester were number of correct responses in
the 64 presentations constituting the test.
Scores on the wall charts were number correct
to three consecutive errors. Rank order
correlations are shown in Table 2.
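
As an illustration of how a rank order correlation of this kind can be computed, the sketch below ranks each set of scores (tied scores share the average rank) and then takes the product-moment correlation of the ranks. The article does not say how ties were handled or which computing formula was used, so this is one conventional approach rather than the authors' procedure, and the scores in the usage example are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


def rank_order_correlation(x, y):
    """Spearman rank order correlation between paired scores."""
    if len(x) != len(y):
        raise ValueError("score lists must be paired")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical scores for five subjects, for illustration only.
    night_vision_scores = [31, 42, 47, 50, 58]
    wall_chart_scores = [12, 9, 17, 15, 21]
    print(round(rank_order_correlation(night_vision_scores, wall_chart_scores), 2))
```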

The smoothed scores were obtained by fitting
through each individual's scores at the
various brightness levels a curve similar in
shape to the function which seemed to represent
the relationship between acuity and
brightness based upon the observations for
15 subjects. These smoothed scores represent
an attempt to get scores in which the error
variance is minimized. Similar logic is implied
in all methods of curve fitting.

Ortho-Rater. The relationship between
scores on the scotopic test and the Ortho-Rater
scores is given in Table 3 below. Scores
on the Ortho-Rater are based on the number
of rights to three consecutive errors. Best
Eye "A" is defined as scores on the eye which
gave best acuity on the majority of brightness
levels tested. Best Eye "B" is defined as
scores on the eye which gave best acuity at
each level. For the first binocular target,
testing was carried out only for Levels 3
through 7.


Table 1
Levels of Illumination Employed

Level of          Wall Chart Tests               Ortho-Rater Tests
Illumination   Ft.-Lamberts   Log μμL         Ft.-Lamberts   Log μμL
1              [illegible]    [illegible]     13.0           10.18
2              [illegible]    [illegible]     3.0            9.51
3              [illegible]    [illegible]     [illegible]    8.96
4              [illegible]    7.94            .070           7.85
5              .020           7.33            .020           7.29
6              .008           6.94            .009           6.96
7              .003           6.51            .003           6.51
8              [illegible]    [illegible]     .001           6.03


Table 2
Correlations of the Army Night Vision Tester with Raw and Smoothed Wall Chart Test Scores
N = 15

Level of        Mod. Landolt              Army Snellen              Line Resolution
Illumination    Raw          Smoothed     Raw          Smoothed     Raw          Smoothed
1               .58          .69          .42          .37          .35          [illegible]
2               .61          .68          [illegible]  [illegible]  [illegible]  [illegible]
3               .59          [illegible]  [illegible]  [illegible]  .53          .62
4               .69          .62          .56          .58          [illegible]  .60
5               .60          .81          [illegible]  [illegible]  .65          .69
6               .62          .87          .61          [illegible]  .53          .62
7               .81          .82          .35          .63          .26*         .58*
8               [illegible]  .62*         .63*         .63*         .04*         -.01*

* The relationships implied by these correlations must be accepted with
reservation as the mesopic tests upon which they are based showed inadequate
differentiation and variance at these levels. Correlations of .51 are
significant at the 1 per cent level.


Discussion 


Relationship of Scotopic and Higher Brightness
Scores. A trend is found for higher correlations
to occur between the wall charts and
the Army Night Vision Tester at the lower
brightness levels (Table 2). Highest correlation
(raw) with the Army Night Vision
Tester occurred at Level 7, 6.51 log μμL, for
the Modified Landolt Ring; at Level 6, 6.94
log μμL, for the Army Snellen; and at Level
5, 7.33 log μμL, for the Line Resolution test.
The correlations of the Ortho-Rater with the 


Table 3

Correlations of the Army Night Vision Tester with Ortho-Rater Test Scores

Columns: Level of Illumination; Best Eye "A"; Best Eye "B"; First
Binocular; Second Binocular. [The correlation entries for Levels 1
through 8 are illegible in the source.]

* Inadequate differentiation of subjects was shown by the tests at these
levels. Correlations of .50 are significant at the 5 per cent level.


Army Night Vision Tester show an increase 
as brightness levels are increased to Level 4 
(7.85 log μμL) and a decrease at higher
brightness levels. Highest correlations with 
the ANVT-R2X are obtained at this level for 
Best Eye “A” and “B” and for first binocular 
scores. The second binocular scores show 
highest correlation at Level 6. 

It might reasonably have been expected 
that scores on the Army Night Vision Tester 
would correlate most highly with tests ad- 
ministered at the lower brightness levels, i.e., 
Levels 7 and 8. Failure to obtain this result 
here may perhaps be explained by the un- 
suitability of the wall charts and Ortho-Rater 
targets employed for testing at the lower 
brightness levels. Correlations appear to in- 
crease up to the point where these targets can- 
not be seen by the subjects. 

The wall charts correlate more highly with
the Army Night Vision Tester than do the 
Ortho-Rater plates (Tables 2 and 3). This 
result may be attributed to the superior acuity 
distributions at the low brightness levels ob- 
tained on the charts. It should be noted that 
one of the wall charts used the Landolt broken 
ring design which is similar to the target used 
in the Army Night Vision Tester. The 
specificity of the Landolt target may have 
increased the correlations. Further study 
should be made to determine whether or not 
the ring gives high correlations with scotopic 
tests of other designs. 

These results would raise doubt concerning 
the allegation that scotopic visual acuity 
scores are too unstable to permit their long- 
term prediction. In the present study, the 
Army Night Vision Tester was administered 
to the subjects a full year before the photopic 
tests. Despite this time difference, correla- 
tions of .60 and higher are found between the 
Army Night Vision Tester and the mesopic 
tests. 


Summary 
The aim of this study was to determine the
correlations among scores on a scotopic visual
acuity test and scores on wall charts and
Ortho-Rater plates administered at various
photopic and mesopic brightness levels. Nineteen
subjects were employed, selected to show
a wide range of scotopic acuity scores. The
correlations obtained are considered only as
indications of relationships, owing to the small
number of subjects employed. Scotopic acuity
was measured with the Army Night Vision
Tester (ANVT-R2X). Brightnesses ranged
from 3.51 to 5.26 log μμL. Mesopic and
photopic acuities were measured with various
wall chart tests and targets used in a modified
Ortho-Rater instrument. Brightness
levels ranged from 6.03 to 10.60 log μμL.
The main findings of this study are as follows:

1. Scotopic acuity scores showed moderate
positive correlations with the photopic acuity
scores, and higher correlations with mesopic
acuity scores, both for the wall chart tests and
the Ortho-Rater plates.

2. The Landolt Ring acuity target has
higher correlations with the Army Night Vision
Tester than do the other targets. It is
not possible to state whether this result is
due to similarity of design of the Landolt
Ring and the Night Vision Tester, or to some
intrinsic factor of the design itself. In future
developmental work on a test of night
vision ability, this target should be included
as one of the mesopic targets.

3. High correlations with mesopic tests
were obtained in the present study, even
though the scotopic test was administered to
the subjects a full year before administration
of the photopic and mesopic tests. This finding
should raise doubts concerning the allegation
that scotopic visual acuity scores are too
unstable to permit their long-term prediction.

4. As a consequence of 1 and 3 above, the
practicability of developing a mesopic test of
night vision ability is indicated. Such a test
would have the following advantages over a
scotopic test: shorter adaptation time (permitting
more rapid testing), less expensive and
cumbersome equipment, less dependence on
light-tight testing conditions, and fewer testing
personnel.


Received June 30, 1952.


References 


1. Armed Forces-NRC Vision Committee. Manual of instructions:
Armed Services vision [...]. University of Michigan, April 1951.

2. Feinberg, R., and Wirt, S. E. Visual acuity in relation to
illumination in the Ortho-Rater. J. appl. Psychol., 1947, 31, 406-412.

3. Field Artillery School. Report on study of night vision. Fort
Sill, Oklahoma, February 1944.

4. Hecht, S. Relation between visual acuity and illumination.
J. gen. Physiol., 1928, 11, 255.

5. Lythgoe, R. J. The measurement of visual acuity. Medical
Research Council, Special Report No. 173, London, His Majesty's
Stationery Office, 1932.

6. Rowland, L. S. Night visual efficiency in illuminations above the
level of the cone threshold. U. S. AAF School of Aviation Medicine,
Randolph Field, May 31, 1944 (3551).

7. Uhlaner, J. E., and Woods, I. A. Studies in night visibility.
Highway Research Board, Bulletin No. 43, 1951.

8. Warden, C. J. An investigation of motion acuity under scotopic
conditions at various retinal positions. U. S. NRC-CAM, April 1944,
Report No. 326.

9. Personnel Research Section, A.G.O. Report 742. Studies in visual
acuity, 1948.

10. Personnel Research Section, A.G.O. Report 816. Validation of
Army Night Vision Tester. 30 January 1950.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 3, 1953 


The Influence of Increased Positive g on Reaching Movements ¹


A. A. Canfield
Wayne University

A. L. Comrey
University of California at Los Angeles

and

R. C. Wilson
University of Southern California


¹ [...] ori-77 Task [...] dissertation [...]. [W]arren supervised
the research and his kind help and [...] gratefully acknowledged.


The pilot of modern high-speed aircraft is
faced with many stress situations that were
unknown to his predecessors, such as the extreme
radial accelerative forces developed
when an airplane is maneuvered through a
change in direction, as in turns and pull-outs
from dives. Popular literature is rich with
stories of pilots blacking out (a temporary
loss of vision due to decreased blood supply
to the eye), suffering sudden displacements
of the lower intestines, bleeding at the mouth
and ears, etc. Other than the occurrence of
blackout and unconsciousness, the latter concomitants
of these forces apparently occur
rarely, if ever, in practice (1).

Physiologists and medical research specialists,
together with engineers, have developed
protective clothing called g-suits in
an effort to counteract these radial forces.
While these efforts have been successful in
elevating the tolerance threshold somewhat,
techniques have not been developed to compensate
for the tremendously increased effective
weight of the body under these increased
accelerative conditions. A person exposed to
a 5 g accelerative force, by definition, has an
effective weight equal to five times his normal
weight. Woods et al. (2) at the Mayo Clinic
centrifuge have shown that it is impossible
for a man to rise from his seat under conditions
of 5 g. In addition to the problem of
general body movement, the increased weight
also introduces problems in moving the extremities,
as the arms and legs weigh equivalently
more. This introduces serious problems
for the pilot when he attempts to reach
for and/or manipulate controls under conditions
of radial acceleration.

While the effect of these radial accelerative
forces on effective body weight is the
same irrespective of the direction from which
the force is imposed, markedly different physiological
effects are associated with them.
When the force is applied along the vertical
axis of the body from head to seat, blood
tends to pool in the abdominal cistern and
the lower extremities. This is the commonly
experienced positive g and is the type studied
in this paper. When the direction of force
application is reversed, blood pools in the
head, and this is called negative g. When the
force is applied at right angles to the vertical
axis of the body it is called transverse g, and
has generally less serious effects. The tolerance
to transverse g is very high (partially
accounting for experiments on the prone position
for high-speed aircraft pilots), next highest
for positive g, and low for negative g.

This research was conducted for the purpose
of evaluating the effects of positive g
forces on the speed and accuracy of ballistic
reaching movements of the arm. All of the
research data were collected on 48 volunteer,
but paid, Ss on the human centrifuge located
on the University of Southern California campus.
Each S had passed a rigorous physical
examination before being allowed to participate
in the study.


Experimental Procedure 


The 48 Ss were randomly divided into four
groups of 12 each. Each group was subjected
to three different g conditions: 1 g, 3 g, and
5 g.

All made a ballistic reaching movement with their
hand to a target approximately 5" square at a
distance of 19" from the starting point. This
was a switch on the end of a metal tube which
projected toward them at shoulder height and at
the midline of the body. From this point they
reached at an angle of 35° to the target in one
of four positions: up, down, left, and right. The
target face was at right angles to the path of
movement for all four target positions. The
whole target area was well within the maximum
working area of the arm as described by Barnes
(3).
The switch at the starting point closed a cir- 
cuit on a standard timer when S removed his 
hand. Another micro-switch was placed behind 
a rubber diaphragm which served as the backing 
of the target. When S hit the target surface 
with his finger, this switch automatically opened 
the circuit and stopped the timer. Another clock
in the circuit started when the starting buzzer 
sounded and stopped when he hit the target. It 
was thus possible to derive the following three 
time scores: the time taken to start the move- 
ment, called reaction time in this study, the time 
taken to make the movement, and the total time 
which elapsed from the sounding of the buzzer 
to the completion of the movement. 
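
The relation among the three time scores can be written out explicitly: one clock gives the movement time (release of the starting switch to target strike), the other gives the total time (buzzer to target strike), and the reaction time follows by subtraction. The sketch below is only an illustration of that arithmetic; the class and field names are hypothetical and the example readings are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTimes:
    """Clock readings for one reaching movement, in seconds."""

    total_time: float     # starting buzzer to target strike
    movement_time: float  # release of the starting switch to target strike

    def __post_init__(self):
        if self.movement_time < 0 or self.total_time < self.movement_time:
            raise ValueError("total time must be at least the movement time")

    @property
    def reaction_time(self) -> float:
        """Time from the buzzer until the hand left the starting switch."""
        return self.total_time - self.movement_time


if __name__ == "__main__":
    trial = TrialTimes(total_time=0.62, movement_time=0.34)  # invented readings
    print(f"reaction time: {trial.reaction_time:.2f} s")
```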

The face of the target was covered with a 
sheet of polar coordinate graph paper scribed in 
intervals of one-tenth inch. The S's preferred finger
was covered with a metal cot that terminated in
a pin-like point. As the target was struck this 
point punctured the polar coordinate target sheet. 
These points indicated the exact location of the 
strikes. The strikes on the target were consid- 
ered from two standpoints—the quadrant of the 
target in which they fell, regardless of the size of 
target center disparity; and the distance from 
the center of the target, direction disregarded. 
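
The two ways of scoring a strike can be sketched as follows. The sketch assumes the puncture has already been read off the target sheet as horizontal and vertical offsets from the target center in tenths of an inch, with positive values to the right and above; that coordinate convention, the treatment of strikes lying exactly on an axis, and the function names are assumptions made for illustration, not details given in the article.

```python
import math
from statistics import mean


def score_strike(x_tenths: float, y_tenths: float):
    """Return (vertical half, horizontal half, circular error) for one strike.

    Offsets are measured from the target center in tenths of an inch.
    Strikes exactly on an axis are assigned to the upper/right half here,
    which is an arbitrary choice for this sketch.
    """
    vertical_half = "upper" if y_tenths >= 0 else "lower"
    horizontal_half = "right" if x_tenths >= 0 else "left"
    circular_error = math.hypot(x_tenths, y_tenths)  # distance, direction disregarded
    return vertical_half, horizontal_half, circular_error


def mean_circular_error(strikes):
    """Average distance from center over a set of strikes (e.g., four responses)."""
    return mean(score_strike(x, y)[2] for x, y in strikes)


if __name__ == "__main__":
    four_strikes = [(3, 4), (-6, -8), (0, 5), (0, 0)]  # invented offsets
    for strike in four_strikes:
        print(strike, score_strike(*strike))
    print("mean circular error:", mean_circular_error(four_strikes))
```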

A number of different types of scores were 
available for comparing S’s performance at the 
different g-levels and target positions such as re- 
action time, movement time, total time, direction 
of error, magnitude of error, and the relation between
the times and the accuracy of the movements.

Before starting the test trials, each S was given
two indoctrination rides on the centrifuge including
a ride at 5 g. If S desired to continue, his
experimental trials were begun. On the first experimental
day, each S spent about fifteen minutes
making movements to the target in the position
he would encounter on that day. Each S
was also trained to detect the difference between
the warning and the reaction buzzers (different
in pitch) and was shown how his responses would
be evaluated in the experiment. All Ss were instructed
to make the movement as quickly and
accurately as possible, and to strike as near the
center of the target as they could. During the
first day's practice, care was taken to assure that
S made a ballistic movement (4), and not a moving
fixation.

Each of the four sub-groups, of 12 Ss each,
had a different order of target position, so
arranged that all positions preceded and followed
each other [text illegible in source]. While the
target-position
order was the same for all subjects in any one
group, the order of imposed force for members 
of the group was systematically varied. As a 
result each g level and target position preceded 
and followed each other an equal number of 
times. These precautions were taken to avoid 
any experimental error that might result from 
the serial effects of either g or target position. 
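
One way to satisfy the stated requirement for the g levels is to use every possible order of the three conditions equally often within a group. The article does not describe the scheme actually used, so the sketch below only demonstrates that such an arrangement, with each of the six orders assigned to two of the 12 Ss in a group, makes every g level immediately precede every other g level the same number of times.

```python
from collections import Counter
from itertools import permutations

G_LEVELS = (1, 3, 5)


def adjacent_pair_counts(orders):
    """Count how often each condition immediately precedes each other one."""
    counts = Counter()
    for order in orders:
        for first, second in zip(order, order[1:]):
            counts[(first, second)] += 1
    return counts


def balanced_orders(conditions, subjects_per_group):
    """Assign every permutation of the conditions equally often (assumed scheme)."""
    all_orders = list(permutations(conditions))
    if subjects_per_group % len(all_orders) != 0:
        raise ValueError("group size must be a multiple of the number of orders")
    return all_orders * (subjects_per_group // len(all_orders))


if __name__ == "__main__":
    orders = balanced_orders(G_LEVELS, subjects_per_group=12)
    for pair, count in sorted(adjacent_pair_counts(orders).items()):
        print(f"{pair[0]} g before {pair[1]} g: {count} times")
```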

Each S had two experimental days following
the first day of practice. The target was placed
in two of the four positions on each of these days,
and S made four consecutive reaching movements
for the target at each of the three positive g
conditions used in the experiment: 1, 3, and 5 g
(1 g is normal gravitational force, and does not
involve centrifuge rotation).

The data of the experiment were 192 move- 
ments (4 each for 48 Ss) made to each target po- 
sition for each of the three different g levels. 
Only the target position and the radial force im- 
posed were known to vary systematically. 


Results 


Direction of Error. The observed distribu- 
tion of the responses in four quadrants of the 
target demonstrated striking changes as the g 
level increased, but the nature of the change 
varied with target position. Figure 1 shows 
the number of responses which fell in each 
of the quadrants for each of the four target 
positions at the three g levels.

An examination of Figure 1 shows that at 
1 g the movements made upward, to the left 
and to the right tended to fall in the upper 
half of the target, and the responses down- 
ward fell about equally in the upper and 
lower halves. The figure also shows that the 
responses tended to fall on the right side of 
the target when it was in the “up” and “down” 
positions, on the left side when in the “left” 
position, and on the right side when in the 
“right” position. As the g level increases, 
however, the responses moved to the lower 
half of the target when in the “up,” “left,” 
and “right” positions and the upper half 
when the target was in the “down” position. 
Similarly, the responses shifted to the right 
half of the target when it was in the “up” and
“left” positions and to the left half when in 
the “down” and “right” positions. 

Table 1 shows the results of Chi Square 
tests of the distribution of responses between 
the g levels for each target position. 
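
The article does not give the computing formula for these Chi Square values. Most of the tabled entries appear to be reproduced if the split observed at the lower g level is treated as the expected split for the higher g level; this is an inference from the published figures, not a statement by the authors. The sketch below applies that assumption to the "up" target position, using the upper-lower and left-right frequencies at 1 g and 3 g as read from Fig. 1.

```python
def chi_square_against_expected(observed, expected):
    """Sum of (O - E)^2 / E over the categories of a split."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Frequencies for the "up" target position, as read from Fig. 1 (192 responses each).
UP_1G_UPPER_LOWER = (121, 71)
UP_3G_UPPER_LOWER = (61, 131)
UP_1G_LEFT_RIGHT = (81, 111)
UP_3G_LEFT_RIGHT = (56, 136)

if __name__ == "__main__":
    up_down = chi_square_against_expected(UP_3G_UPPER_LOWER, UP_1G_UPPER_LOWER)
    left_right = chi_square_against_expected(UP_3G_LEFT_RIGHT, UP_1G_LEFT_RIGHT)
    # Table 1 lists 80.45 (u-d) and 13.35 (l-r) for levels 1-3, target "up".
    print(f"up-down: {up_down:.2f}   left-right: {left_right:.2f}")
```

With one degree of freedom, values above 3.84 and 6.64 exceed the 5% and 1% points cited in the footnote to Table 1.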

Of the 24 values, 16 are significant beyond
the 1% level of significance, and in all but
six instances they are significant beyond the
5% level. The responses clearly tend to move
to the nearer and lower quadrants of the target
as the g level increases.


                        1 g                    3 g                    5 g
                 Left  Right  Total     Left  Right  Total     Left  Right  Total
UP      Upper      48     73    121       20     41     61       25     23     48
        Lower      33     38     71       36     95    131       49     95    144
        Total      81    111    192       56    136    192       74    118    192

DOWN    Upper      26     63     89       42     38     80       60     47    107
        Lower      26     77    103       54     58    112       48     37     85
        Total      52    140    192       96     96    192      108     84    192

LEFT    Upper      82     45    127       45     29     74       24     35     59
        Lower      31     34     65       56     62    118       52     81    133
        Total     113     79    192      101     91    192       76    116    192

RIGHT   Upper      38     68    106       58     45    103       37     24     61
        Lower      43     43     86       52     37     89       74     57    131
        Total      81    111    192      110     82    192      111     81    192

Fig. 1. Frequency of responses in the various target quadrants by target position and g level.


Table 1
Chi Square Values from a Comparison of the Obtained Left-Right and Up-Down Splits in
Target Strikes at the Various g Levels with Each Other*
No. of responses = 192

                                   Target Position
                  Up               Down              Left              Right
Levels        u-d     l-r      u-d     l-r      u-d     l-r      u-d     l-r
1-3         80.45   13.35     1.70   51.06    65.33    3.09     0.18   17.97
1-5        148.45    1.26     6.79   82.70   107.54   29.44    42.64   19.22
3-5          4.06    8.17    15.62    3.00     4.95   13.06    36.95    0.02

* Chi² of 3.84 significant at the 5% level of confidence; Chi² of 6.64
significant at the 1% level of confidence.


Response Accuracy. In all of the four target
positions the accuracy of the movement
was severely impaired by the higher accelerative
forces. Table 2 shows the circular errors
for the various g levels and target positions.

In all cases the magnitude of the error of
movement was larger (significant beyond the
1% level of confidence) at 5 g and 3 g than
it was at 1 g. The increase between 3 g and
5 g, however, was not significant for either the
“down” or “right” positions.

The accuracy of movements to the left was
significantly poorer at all g levels than those
made downward and to the right.² It was
also significantly poorer than upward movements
except at 5 g where no significant difference
was found between the two. Movements
to the right, upward, and downward
were not significantly different in accuracy
at the 1 g level, but both movements
to the right and down were significantly more
accurate than reaching upward at the increased
g levels.

In general, movements to the left were the
least accurate at all g levels, with movements
into the other three planes showing no significant
difference under normal conditions.
Movements to the right and down, however,
are a great deal more accurate than upward
movements at the increased g levels. Reaching
to the right is somewhat more accurate
than reaching down at 3 g, but no significant
difference was found at 1 g or 5 g.

² Forty-seven of the 48 Ss preferred the right hand
for making this type of movement.


Table 2
Means, Standard Deviations, and Standard Error of
the Means of the Circular Error Scores*
No. of subjects = 48

[The entries for the four target positions at 1 g, 3 g, and 5 g
are illegible in the source.]

* All values presented in this table are in tenths
of inches. Each of the 48 scores from which these
values were computed represented the average error
of the four responses made at the g level and target
position indicated.


Movement Time. The time required to
complete the ballistic movement increased
markedly as the g level increased for movements
upward and to the left, but was less
seriously impaired for movements downward
and to the right. Table 3 shows the movement
times with the target in the four positions
and for each of the g levels.

The differences in the movement time are
significantly higher (beyond the 1% level)
for each succeeding g condition when the target
is in the “up” position. The time required
for the movement is similarly, though not as
seriously, impaired for movements to the left.
Movements to the downward direction did not
show any significant increase in time, and
movements to the right were not consistently
impaired.

Both the movements made downward and to 
the right were faster at the increased g levels 
than those made upward and to the left. 

These differences are all significant beyond
the 1% level of significance except the down- 
left comparison at 3 g which is only significant 
at the 5% level. The only significant differ- 
ence between the speed of movement at 1 g 
in the various directions was that movements 
to the right were faster (at the 5% level) 
than those made upward. No significant dif- 
ferences were found between the speed of 
reaching to the right or downward. 

Reaction Time. Previous research (5) has 
indicated that simple reaction time to both 
sound and light stimuli increases significantly 


Table 3

Means, Standard Deviations, and Standard Error of
the Means of the Movement Times*
No. of subjects = 48

                            g Level
Target            1 g             3 g             5 g
Position        M     SD        M     SD        M     SD
Up            1.35   .295     1.50   .484     2.20   .620
Down          1.31   .337     1.27   .378     1.31   .406
Left          1.33   .306     1.38   .438     1.59   .505
Right         1.28   .323     1.23   .359     1.35   .397

* All values are given in seconds. Each of the 48
scores from which these values were computed represented
the total time taken for four response movements
at the g level and target position indicated.


Table 4

Phi Coefficients Between Circular Error and Movement
Time for the Target Positions by g Level
No. of subjects = 48

                      Target Position
g Level       Up         Down       Left       Right
1           -.169      -.416**     -.248      -.416**
3           -.374**    -.300       -.248      -.500**
5            .208       .374        .332*      .627**

* Significant at the 5% level of confidence.
** Significant at the 1% level of confidence.


at increased g levels. The term reaction time 
as used in this study of reaching movements 
should not be interpreted as a measure of the 
maximum speed of reaction, but should be 
considered as a preparation period prior to 
instigating a movement. This period was 
significantly longer (beyond the 1% level) 
for all target positions at 5 g than it was at 
1 g, but did not differ significantly between 
any of the target positions, or correlate sig- 
nificantly with either the accuracy or speed 
of the movement which followed. 

Relation between Movement Speed and Ac- 
curacy. It was consistently found that longer 
movement times were associated with greater 
accuracy. Table 4 shows the Phi coefficients 
derived from the intercorrelation of each S’s 
movement time and accuracy scores at each g 
level and target position. The significance 
of the Phi coefficients was determined through
converted Chi Square values. 
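
The conversion referred to here is presumably the standard identity between a Phi coefficient from a fourfold table and Chi Square with one degree of freedom, Chi² = N × Phi². The sketch below shows that identity; the fourfold table in the usage example is hypothetical, since the article does not report how the time and error scores were dichotomized.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the fourfold table [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


def chi_square_from_phi(phi: float, n: int) -> float:
    """Chi Square (1 degree of freedom) equivalent to a Phi coefficient."""
    return n * phi ** 2


if __name__ == "__main__":
    # Hypothetical split of 48 Ss into fast/slow by accurate/inaccurate.
    phi = phi_coefficient(18, 6, 6, 18)
    print(f"phi = {phi:.3f}, chi square = {chi_square_from_phi(phi, 48):.2f}")
    # A tabled coefficient of .332 with N = 48 falls between the 5% point (3.84)
    # and the 1% point (6.64) cited earlier in the article.
    print(f"{chi_square_from_phi(0.332, 48):.2f}")
```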

Inasmuch as speed of movement and 
amount of error were found to be negatively 
related, it must follow that since both speed 
and accuracy were impaired under increased 
g conditions, the accuracy would be even 
further impaired if the same movement time 
were achieved, and vice versa, 


Interpretation 


Direction of Error. The fact that the
reaching movements tended to terminate in
different quadrants of the target as the g
level increased is attributed to two different
sources. The first of these might be termed
“experimental error.”


Under increased g conditions, the first re- 
sponse of the Ss to the target would quite fre- 
quently fall far below the center of the target. 
If the trial which followed was a 1 g trial, 
the first movement was frequently quite high. 
Both of these types of errors on the first 
movement (too low at the increased g condi- 
tions and too high at the normal g condition) 
were often accompanied by exclamations of 
surprise. The first movement at 1 g was fre- 
quently made in response to the pattern of 
kinesthetic cues that had been used for mov- 
ing the arm under the previous atypical weight 
conditions. This introduced a source of response
error that is difficult to judge; but, if
recognized as a systematic source of error,
it can be attributed to the conditions
of the experiment and does not detract
from the meaningfulness of the results insofar
as the effect of increased g on movement
is concerned.

Second, the accumulation of strikes on the
lower and nearer sections of the target suggests
that two different types of movement
errors occurred. First, the observance of responses
on the near side of the target suggests
that the initially applied force was insufficient
to carry the arm to the intended termination
point, and second, the tendency for the strikes
to accumulate on the lower half of the target
is attributed to an error in judging the required
trajectory of movement. Following
the terminology of Brown et al. (6), the first
of these has been called the “negative inertia
error,” and the second has been termed the
“error of downward tendency.” As a consequence,
one would expect the strikes to fall
in the lower half of the target in the “up”
position as both errors are acting in the same
direction. In addition, since the types of
errors are additive in this position, one would
anticipate movements to this position to be
the least accurate of all. The results verify
this deduction. Similarly, the responses to the
target in the left and right positions would be
expected to accumulate on the lower and nearer
section. This is what occurred. With the
target in the down position, the errors tend
to offset each other, the negative inertia
error tending to make them fall on the upper
half of the target, and the error of downward
tendency tending to make them fall on the
lower half of the target. The results have
shown that the responses fell on the upper
half at 5 g, indicating that the negative inertia
error was the more predominant of the two.

Response Accuracy. The increase in the 
error of the movement is attributed to the in- 
adequacy of the normal kinesthetic cues under 
the increased g conditions. Possibilities of 
reduced visual acuity at the 3 and 5 g con- 
ditions are highly unlikely in view of previous 
research findings on perceptual speed ability 
at these same g levels (7) and the fact that 
no S ever reported gray-out, the preliminary 
symptoms of blackout. 

Movement Time. The increase in move- 
ment time is attributed to the failure of the 
Ss to throw the arm with sufficient force to 
compensate for its increased effective weight. 
Despite the fact that movements made at
the increased g levels were in the main shorter
(responses falling on the near side of the
target), they took longer. The difference in
time is hardly within that which might include 
a shift from a ballistic to a moving fixation 
movement, and the error pattern reflects no 
such alteration in method of arm movement. 

Reaction Time. The increase in reaction 
time, as defined in this study, is attributed to 
an increased cogitation period before start- 
ing the reaching movement. After the first 
movement at increased g, Ss were immediately 
aware of the fact that this was a different 
situation, calling for a different arm thrust. 
The increase in time between the warning 
buzzer and the start of the movement is con- 
sidered an increase in the readiness period 
taken by the Ss to get better “set” for the 


ensuing movement. 


Summary 


From the results of this research, certain 
conclusions about the effect of increased posi- 
tive radial acceleration on reaching move- 
ments may be advanced. 

1. Both the speed and accuracy of reach-
ing movements at increased g levels are seri-
ously impaired, the degree of impairment
being roughly equivalent to the amount of
force imposed.

2. The kinesthetic cues governing the 
thrust of the arm under normal circumstances 
are inadequate to maintain similar accuracy 
or speed under radial accelerative conditions. 

3. Due to the increased weight of the arm 
and the inadequacy of the normal kinesthetic 
cues, two types of errors are found, one being 
the negative inertia error and the other the 
error of downward tendency. 

4. The most favorable location of controls 
for the pilot of high-speed aircraft, both from 
the standpoint of speed and accuracy, is to 
the side of the pilot’s preferred hand and be- 
low its normal resting point. 

5. Emergency controls that might have to 
be manipulated under conditions of increased 
positive radial acceleration should be no 
smaller than two inches in diameter if a push- 
ing motion is required. 


Received June 23, 1952. 


References 


1. Wood, E. H., Lambert, E. H., and Code, C. F.
Do permanent effects result from repeated
blackouts caused by positive acceleration? J.
Aviat. Med., 1947, 18, 471-481.

2. Code, C. F., Wood, E. H., and Lambert, E. H.
The limiting effect of centripetal acceleration
on man’s ability to move. J. Aero. Sci., 1947,
14, 117-123.

3. Barnes, R. M. Work methods manual. New
York: Wiley and Sons, Inc., 1944.

4. Hartson, L. D. Analysis of skilled movements.
Personnel J., 1932, 11, 28-43.

5. Canfield, A. A., Comrey, A. L., and Wilson, R. C.
A study of reaction time to light and sound
as related to increased positive radial accelera-
tion. J. Aviat. Med., 1949, 20, No. 5.

6. Brown, C. W., Ghiselli, E. E., Jarrett, R. F., and
Minium, E. W. Speed and accuracy of spatial
location in the prone position. Aero. Med.
Laboratory, Eng. Div., Air Materiel Com-
mand, Wright Field, Dayton, O. Serial No.
MCREXD-694-4H, E. O., 1948, 694-17.

7. Canfield, A. A., Comrey, A. L., Wilson, R. C., and
Zimmerman, W. S. The effect of increased
positive radial acceleration upon human abili-
ties (Part II: Perceptual speed ability). U. of
S. Calif., Dept. of Psychol., Office of Naval
Res. Contract N6 ori 77, Task Order III, Re-
port No. 4, 1948.


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 3, 1953 


Applied Psychology in Action 


Editor’s Note: An announcement of this
new feature of J. appl. Psychol., including a
plea for psychologists in business and indus-
try to send in suitable material, was sent to
about 40 psychologists on the firing line early
in February, 1953. As of the beginning of
April, not a single response had been received.
As noted in the April issue, this new feature
will be continued only long enough to de-
termine whether or not our readers desire
such a section and whether or not psycholo-
gists will take the time and trouble to submit
suitable copy.


Job Supervision of Young Workers


The following is extracted from a Report
of the Technical Committee on Supervision
of Young Workers, Bureau of Labor Stand-
ards, U. S. Department of Labor, February,
1953, composed of: Chairman, Mrs. Margaret
F. Ackroyd, Chief, Division of Women and
Children, Department of Labor, Providence,
Rhode Island; Fanny G. Buss, Standard Oil
Company, Cleveland, Ohio; Mrs. Mary
Cooper, Hutzler Brothers Restaurant, Balti-
more, Maryland; Jane F. Culbert, Vocational
Advisory Service, New York City; Gilbert
David, The Prudential Insurance Company,
Newark, New Jersey; James Forster, DeKalb
Agricultural Association, Inc., DeKalb, Illi-
nois; Harry Gladstine, The Washington Post,
Washington, D. C.; Dr. Dale Harris, Insti-
tute of Child Welfare, University of Minne-
sota; Mrs. Bernice Heffner, American Federa-
tion of Government Employees, Washington,
D. C.; Kathryn-Lee Keep, Department of
Labor and Industry, Erie, Pennsylvania; R.
Bruce Neill, James Monroe High School,
Fredericksburg, Virginia; Clyde L. Schwy-
hart, Caterpillar Tractor Company, Peoria,
Illinois; Thomas E. Walsh, Amalgamated
Clothing Workers of America, Troy, New
York; Benjamin C. Willis, Superintendent of
Schools, Buffalo, New York; and Mrs. Ger-
trude Folks Zimand, National Child Labor
Committee, New York City.

“What Should the Supervisor Know About
Youth? The Committee believed that the
core of the task of getting better supervision
of young workers is to help supervisors of
youth to be more interested in and better
understand the basic characteristics of youth
—their capabilities, their problems, their at-
titudes, and their needs. The responsibilities
and the attitudes of work supervisors neces-
sarily center largely about a concept of work-
ers as adults. Nine-tenths of the Nation’s
workforce is indeed past 20. Yet almost
every industry and business has at least a
small component of young beginners. That
youth are not yet adult is a truism, but too
often not fully understood by the man or
woman whose responsibility it is to help
youth give good work performance and grow
up to be good workers.

In attempting to define the characteristics 
of youth of significance to a work supervisor 
attention was focused upon youth of about 
14 to 18 years of age. The Committee be- 
lieved, however, that a description of this 
midadolescent group would be useful in un- 
derstanding older youth on the job as well. 
The Committee was exceedingly grateful to
its member, Dr. Dale Harris, for preparing
in advance of the meeting an analysis of the
characteristics of youth in their midadoles-
cence with special reference to those charac-
teristics likely to be significant in job situa-
tions. Bringing the practical job experience
of various Committee members to bear on Dr.
Harris’ contribution, the Committee developed
the following description:

Youth—A Period of Adjustment. The su-
pervisor must first of all realize that adoles-
cent boys and girls are in transition from
childhood to adulthood, and that this stage
in a person’s development may be a difficult
period of personal adjustment. This transi-
tional stage has no precise age limits, but is
defined by psychologists as beginning at
roughly 12 to 14 years and continuing to …
or 22 years. The midadolescent period of
about 14 to 18 years of age is normally the
period of greatest stress.

The major areas of adjustment that are the 
primary concerns of teen-agers are usually 
considered to center about four factors: (1) 
how to be attractive to the opposite sex, (2) 
problems of family relations which result 
from their attempts to emancipate themselves 
from parental control, (3) for those still in 
school, anxiety over school achievements, (4) 
concern with vocational plans—though this 
may often be vague and unrealistic. 

Of great significance to those who super- 
vise the early work experience of adolescents 
is the youth’s concern to be considered some- 
body, a person of importance to himself and 
others, with a place in the world and a con-
tribution to make.

Of equal significance to supervisors is
youth’s own insecurity about the emerging 
responsibilities and challenges of adulthood 
and how to act to realize them. They do not 
like to admit these insecurities, but they are 
nevertheless there. Young people therefore 
reach out for security which in large part 
they attempt to find by tying closely to a 
group of their own age. They seek the ap- 
proval of that group, and conformance to its 
standards becomes very important to them. 

In the adolescent’s desire to be grown-up, 
he sometimes has difficulty in accepting the
authority of adults. This adolescent ‘revolt’
expresses itself in various ways—including
the display of immoderate behavior, language
or dress.

Basic Characteristics. There is, of course,
a tremendous range of differences among in-
dividuals at adolescence as at any age. No
two are indeed alike. However, the basic
characteristics of the adolescent stage of de-
velopment which the supervisor of young
workers needs to be aware of can in general
be described as follows:

Physical Maturity. Most girls will be
physically matured by the age of …;
many boys are still immature at ….
Consequently girls are much more likely to
appear mature and socially poised than boys
their own age.


Strength is closely related to physical ma- 
turity. Boys gain 50 per cent in actual 
muscle volume during adolescence, girls much 
less. Sexually immature individuals at this 
age will lack considerably in strength and 
endurance. There are great differences in 
strength between physically mature and im- 
mature boys of the same ages. 

Adolescents can mobilize much energy on 
demand, but not all youth are able to main- 
tain sustained output. Furthermore, growth 
in muscle volume may lag behind growth in 
stature; a youth may not be as strong as he 
first appears. 

Many adolescents of this age have yet to 
learn how to achieve a balance between physi- 
cal needs for rest on the one hand, and social 
interest and needs on the other. 

Physical health is good; the period is char- 
acterized by little illness. 

Basic motor skills, such as speed of move- 
ment, reaction time, and coordination are 
fully developed although not necessarily fully 
trained. 

Intellectual Maturity. Intellectual stature 
has just about been reached; measurable in- 
crements of intelligence after fifteen are much 
less significant than those which occur from 
ages ten to fifteen. Many older adults fail 
to recognize that the average adolescent of 
sixteen and seventeen has achieved sharpness 
of intellect and a heightened readiness to 
learn. 

Even though he may be ‘bright,’ the adoles- 
cent is limited in ‘judgment.’ Though he 
lacks experience, he resents being talked down 
to and being considered unable to solve prob- 
lems. 

The adolescent’s ability to think abstractly 
is well developed, which leads him to seek 
reasons based on principle. This sometimes 
makes him appear argumentative. 

Adolescents are able to evaluate their own 
behavior, and actually they engage in a great 
deal of self-criticism. They are often quite 
sensitive to blame, though they may not seem- 
ingly admit failure when criticized. They 
may be easily discouraged. 

Many adolescents exhibit a great deal of 
intellectual and emotional ‘questing’—a vague
longing for something unknown. This makes
them receptive to emotional appeals to loyalty, 
integrity, self-sacrifice. 

Interests and Attitudes. The day-to-day 
behavior of the midadolescent will show con- 
siderable vacillation between the developing 
interests of maturity and the interests of the 
younger child, though the adolescent himself 
frequently will reject the less mature interests 
after returning to them temporarily. Fre- 
quently adults see this as unpredictability and 
unreliability. 

Many adolescents are characterized by con- 
siderable idealism and a sense of altruism, 
and at the same time by snobbishness and a 
feeling of superiority. 

The critical capacities of the adolescent 
may extend to others, so that he appears to 
be highly intolerant. Combined with ideal- 
ism, this characteristic leads him to seek per- 
fection in adults, and he may feel let down 
when they fail to measure up to his expecta- 
tions. These attitudes may carry over into 
his relationships with his work supervisor. 

Adolescents frequently have a strong de-
sire to do well, and to get ahead. Although
they are somewhat vague about specific goals
in life they want ‘to know where they are
going.’

Social Behavior. Much of the social be-
havior of adolescents is characterized by ap-
parent contradictions which upon closer in-
vestigation are found to be more apparent
than real.

There is a strong desire to be treated as in-
dividuals; there is also a strong desire to con-
form to the standards set by young people
their own age.

On the other hand, there may be much
deliberate imitation of the attitudes and ac-
tions of adult associates, especially of those
they admire.

There is a strong need to be independent;
there is also a strong need to be dependent.

Language is ostentatiously colorful, slangy
and emotional. Adolescents’ use of profanity
or obscenities may actually be an attempt to
appear sophisticated and mature.

The adolescent is typically group-minded;
he wants to ‘belong,’ and will respond readily
to the idea of teamwork.”


Personnel Psychology in a Steel Company


The work of Personnel Psychologist George
M. Hill at the Armco Steel Corporation, Mid-
dletown, Ohio, was featured in the February
19, 1953 issue of The Iron Age, pp. 61-62.
The following interesting chart was included
in the article:


[Chart: “Industrial Psychology.” Legible labels: Employee Testing, Employee Counseling, Personnel Auditing; Better Workers, Better Management; Job Satisfaction, Longer Odds Against Accidents, Attracts Desirable Employees, Minimizes Misfits.]


Book Reviews 


Kephart, Newell C. The employment inter- 
view in industry. New York, McGraw- 
Hill, 1952. Pp. 277. $4.50. 


The book covers the main items regarding 
content and method of the employment in-
terview and a lot of other material about em-
ployment procedures. It is on this point
that the author might be criticized, namely,
including so much other material in a book
entitled Employment Interview. It would
seem from the discussion that the interviewer 
gives tests and visual examinations, diagnoses 
mental maladjustments and interprets all 
available predictors of the criterion. To be 
sure this is all pertinent to the process of 
hiring and the discussion of these aspects is 
sound. Actually there is scarcely enough ma- 
terial on the face-to-face aspect of the em- 
ployment interview to make a respectable 
book and the author presumably did what any 
of us would have done under the circum- 
stances. 

The initial chapter sets the place of the in- 
terview with reference to usual employment 
procedures. Then follow chapters about the 
content of the interview and some “how to” 
aspects. The first of these involves knowl- 
edge of the job with due reference to the Dic- 
tionary of Occupational Titles and various 
blanks devised by the War Manpower Com- 
mission. It is helpful to have some of this 
WMC material in a handy form. … chapter
deals with evaluating past experience by
means of job families … Part Four of the
Dictionary, … of intelligence and of motor
abilities. The author cautions … intelligence
from the … but indicates some …
manifested in the interview, … vocabulary
… sentence structure … that might give
some indirect evidence.

The chapter on personality includes specific
types of behavior that might be observed dur-
ing an interview. It also discusses clinical
symptoms and syndromes. An interesting
… is attempting to find a job for an
applicant with serious personality deviations
who may, nevertheless, adjust to some kind

of work. This is a commendable acceptance 
of industry’s social responsibility. In con- 
nection with physical demands of the job 
there is due emphasis on vocational possibili- 
ties for persons with physical disabilities. 
Emotional maturity is mentioned as especially 
important for leadership jobs and some spe- 
cific interview questions are suggested which 
might bring out emotional maturity. 

The last two chapters deal more specifi- 
cally with the mechanics of the interview— 
what the reader would anticipate from the 
title of the book. One considers preliminary 
preparation—the actual environment of the 
interview and the application forms. There 
is a tabulation of items involved in a consid- 
erable number of application forms which 
might help someone in devising a form of his 
own. With reference to the actual conduct 
of the interview there is emphasis on the 
avoidance of bias and of stereotyped methods. 
The patterned interview is recommended on 
the basis of some experimental studies of re- 
liability and validity. The over-all conclusion 
appears to be that the interview is needed to 
supplement tests because the latter do not get 
at everything needed in the job and do not 
have perfect validity. The reviewer is dis- 
posed to add that as we perfect objective per- 
sonality tests the importance of the interview 
may ultimately decrease. A final topic is the 
importance of giving the applicant adequate 
information about such things as hazards, 
working conditions and possibilities on the
job. This aspect might very profitably re-
ceive more stress in the discussion.

The book is moderately well documented
with references at the end of each chapter to
a few pertinent experimental studies. The
general level is fairly elementary except for
an occasional mention of something like mul-
tiple correlation and the book evidently is
designed for the person without much tech-
nical background. There are a lot of wise
cautions for an interviewer, such as not being
… by a good talker. There are also a lot
of hints as to what kind of questions to ask
in order to bring out indirectly some aspect
of personality. Finally, there is a wholesome


emphasis on the social aspects of the employ- 
ment program and of industry’s responsibility 
for the over-all adjustment of the worker. 


Harold E. Burtt 
Ohio State University 


Wechsler, David. The range of human ca- 
pacities. Second Edition. Baltimore: 
Williams and Wilkins Co., 1952. Pp. 190. 
$4.00. 


This is a revision of Wechsler’s 1935 book 
of the same title. New chapters on produc- 
tive operations and span of life have been 
added and the chapter on the effect of age 
has been enlarged. There has been minor 
rewriting in many parts of the book and it 
has been completely reset. 

The purpose of the book is still “to show 
that human variability, when compared to 
that of other phenomena in nature is ex- 
tremely limited, and that the differences 
which separate human beings from one an- 
other . . . are far smaller than is ordinarily 
supposed.” 

In pursuit of this objective, Wechsler 
gathers data concerning human capacities 
(defined to include such diverse measures as 
temperature, height, reaction time and in- 
telligence) and compares the score of the 
lowest person with the highest person within 
the normal population. Normal population 
is arbitrarily defined as excluding one-tenth 
of one per cent of the total population at each
extreme. The comparison effected is in terms 
of the range ratio which is simply the high- 
est score or value divided by the lowest. 
Wechsler notes that these range ratios are 
small (i.e., less than 5) and asserts that they 
are in the nature of natural constants. A 
hierarchy of range ratios is postulated ex-
tending from about 1.30:1 in the case of
linear traits (such as stature and arm length)
to about 2.5:1 in the case of perceptual and
intellectual abilities. It is implied that the 
“real” upper limit is probably the growth 
constant e (2.7182) and an attempt is made 
to show that the orderly hierarchy of ratios 
is a function of the number of factors in- 
volved in the various human capacities. 
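
As an illustration only (it is not taken from Wechsler's book or from this review), the range ratio described above can be sketched in a few lines of Python. The function name `range_ratio`, the parameter `trim_fraction`, and the stature figures are hypothetical; the only elements drawn from the text are the exclusion of one-tenth of one per cent of cases at each extreme and the division of the highest remaining value by the lowest.

```python
def range_ratio(values, trim_fraction=0.001):
    """Highest value divided by lowest, after dropping trim_fraction of cases at each extreme."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of cases dropped at each end
    kept = ordered[k:len(ordered) - k] if k else ordered
    if kept[0] <= 0:
        # A ratio is only meaningful on a scale with a true zero and positive values.
        raise ValueError("a ratio needs a positive lowest value")
    return kept[-1] / kept[0]


# Made-up statures in inches for 2,000 people, used purely as sample input.
statures = [60 + 0.006 * i for i in range(2000)]
print(round(range_ratio(statures), 3))
```

With 2,000 cases the sketch drops two cases from each end before forming the ratio; with fewer than 1,000 cases nothing is dropped, which is one reason small samples are a poor basis for such ratios.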

In the new chapter on productive opera-
tions, data are reviewed on employee pro-
ductivity and it is concluded that a ratio of
2.00:1 expresses the difference between the 
best and poorest workers. This is interpreted 
as indicating that one cannot expect much 
from selection techniques and need not be 
concerned about uniform pay vs. sliding pay 
scales. 


The chapters on Length of Life, Exceptions, 
and the Burden of Age, while interesting, are 
extraneous to the major theme. For example, 
expected life span gives very large range 
ratios at whatever period of life the ratios 
are computed and Wechsler concludes that 
life span is either not a capacity or that the 
data are badly contaminated. 

In the chapter on Genius and Deficiency 
Wechsler embraces the theory of critical dif- 
ferences to explain both ends of the con- 
tinuum. After a given quantitative change, 
he asserts, qualitative distinctions appear.
The ability to see new relationships might be 
such a change at the upper end of the scale. 
This qualitative change, it is maintained, ac- 
counts for differences which greatly exceed 
the “mere 50 IQ points” which separate the 
genius and the idiot from the average. In his 
last chapter on the Meaning of Differences, 
however, Wechsler apparently forsakes this 
line of thinking and returns to the refrain that 
if the range ratios yield small numbers, then 
the differences which separate men are incon- 
sequential. 

An appendix on mental measurement and 
one containing his basic data complete the 
book. 

To this reviewer, the book suffers from three 
major confusions. First, it seems obvious 
from the frequent reference to social signifi- 
cance, democracy, rule by the elite, etc. that 
Wechsler feels that to believe in democracy 
one must demonstrate that all people are 
really equal in everything from body tem-
perature to test scores. This, of course, is
a confusion of value judgments with descrip-
tive physical and psychological statements.
It is perhaps a common confusion but should
be deplored all the more for that fact.
Second, while Wechsler knows the rules for
measurement and the necessary prerequisites
for making meaningful ratios, he apparently
does not apply these rules to all of his data
and reports range ratios for Mental Ages,
IQ’s and number of items correct in a vocabu-
lary test. It is obvious that these ratios have
no significance since they do not have known, 
meaningful zero points and equal units. 
Third, while Wechsler recognizes that words 
like “small” and “large” are judgmental words 
whose meaning is not clear without a set of 
references, he persists in saying that because 
range ratios can be expressed in “small num- 
bers” (arbitrarily defined), they are “small.” 
And, since they are “small,” the differences 
between people are “small” and this has great 
social significance. In other words, “small” 
is defined in one context and applied in an- 
other where it carries added meanings. 
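
The second objection, that a ratio has no significance without a meaningful zero point, can be shown with a small sketch. The example is an assumption introduced here, not one used by Wechsler or the reviewer: the same two temperatures (hypothetical values) give different "range ratios" depending on whether they are expressed in Fahrenheit or Celsius, because neither scale has a true zero.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Two hypothetical body temperatures, chosen only for illustration.
low_f, high_f = 97.0, 100.0

ratio_f = high_f / low_f
ratio_c = fahrenheit_to_celsius(high_f) / fahrenheit_to_celsius(low_f)

# The physical difference is identical, yet the two ratios disagree.
print(round(ratio_f, 4), round(ratio_c, 4))
```

The same arbitrariness would attach to ratios of IQ's or of raw vocabulary scores, where the location of zero is a convention of the test rather than a property of the trait.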

The work is further marred by misinterpre- 
tation of the data and a running series of 
errors. In the chapter on productive opera- 
tions, for example, ten range ratios are intro- 
duced as evidence: 1.73, 2.00, 2.04, 2.10, 2.30, 
2.53, 2.55, 2.57, 2.83, 3.00. From this array 
it is concluded that the range of productivity 
is “. . . not more than 2.5:1 and generally 
more nearly 2.0:1” (!). No evidence is ever 
given for the repeated statement that e is the
probable upper limit of the ratio. Reference 
is made to figures and tables which disagree 
with the text (e.g., Figure 5, Table 7); a 
significant claim is made about modes in the 
data, one of which is nonexistent, etc. The 
invitation to check the results by recalculat- 
ing the data in the appendix is not reassuring. 
In a cursory examination of these data the 
reviewer found fifteen cases of considerable 
error. Many of these errors also appear else- 
where in the book. Either the data have been 
misprinted, the original range ratios miscal-
culated, or both. In two cases the data are
patently impossible.
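
The reviewer's complaint about the ten productivity ratios can be checked directly. The sketch below uses only the ten values quoted above; the variable names are hypothetical, and the summary statistics chosen (mean, median, simple counts) are an assumption about what a reader might reasonably compute, not a procedure described in the book.

```python
from statistics import mean, median

# The ten range ratios quoted in the review.
ratios = [1.73, 2.00, 2.04, 2.10, 2.30, 2.53, 2.55, 2.57, 2.83, 3.00]

above_2_5 = [r for r in ratios if r > 2.5]          # cases exceeding the claimed ceiling
at_or_below_2_0 = [r for r in ratios if r <= 2.0]   # cases at or under the claimed typical value

print(f"mean   = {mean(ratios):.3f}")
print(f"median = {median(ratios):.3f}")
print(f"above 2.5: {len(above_2_5)} of {len(ratios)}")
print(f"at or below 2.0: {len(at_or_below_2_0)} of {len(ratios)}")
```

Half of the ten values exceed 2.5, and only two are at or below 2.0, so the array does not support the statement that the range is "not more than 2.5:1 and generally more nearly 2.0:1."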

Wechsler’s basic notion of the range ratio 
offers interesting and intriguing research ideas 
when confined to the kinds of data to which 
it legitimately applies. In the present con-
text its potential value appears to be buried
in a host of confusions and irrelevancies.
There appears to be no more need for the
second edition of this book than there was
for the first edition.


James J. Jenkins
University of Minnesota


Division of Occupational Analysis, United 
States Employment Service. Dictionary 
of occupational titles, second edition. 
Washington: United States Government 
Printing Office, 1949. Volume I, Defini- 
tions of Titles, Pp. xxviii + 1518, $4.00. 
Volume II, Occupational Classification and 
Industry Index, Pp. xxvi + 743, $2.50. 


Users of the DOT should welcome the 
Second Edition because it provides them with 
more occupational information in more usable 
form than did the early edition. Volume I 
contains the job definitions including those 
from the old Part I, the various supplements, 
and additional new definitions. The appen- 
dices from the original edition (Glossary; 
Index of Commodities to assist in classifying 
Sales Personnel; Occupational Titles Ar- 
ranged by Industry) have been moved to 
Volume II of the Second Edition. Other
readily apparent changes are the introduction 
of a double alphabetic scheme of presenting 
definitions and a considerable simplification 
of the reference techniques. The main alpha- 
betic listing presents every job and occupa- 
tional title by straight letter alphabetization. 
This is a desirable change over the former 
word alphabetic listing which was more trou- 
blesome to users because of the numerous 
compound and multi-word titles. For ex- 
ample, in the original edition CELLAR 
WORKER preceded CELLARMAN, while 
in the new edition the order is reversed. 
Within the main listing are indented sub- 
listings of job definitions most closely related 
to the base definition; thus, users are saved 
the time of locating these definitions through- 
out the volume and can more readily compare 
the different definitions. The variety of 
reference phrases found in the first edition is
now reduced to the words “see” and “see 
under.” Teachers and others providing in- 
struction in DOT usage will join the user in 
approving these changes. 
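
The difference between the two alphabetizing schemes can be made concrete with a short sketch. The two key functions below are hypothetical stand-ins for the editions' filing rules, which the review describes only in outline: word-by-word filing compares the first word before looking at the second, while straight letter filing ignores the spaces between words.

```python
def word_by_word_key(title):
    """Former scheme: compare titles one whole word at a time."""
    return title.split()


def letter_by_letter_key(title):
    """New scheme: ignore spaces and compare the title as one run of letters."""
    return title.replace(" ", "")


titles = ["CELLARMAN", "CELLAR WORKER"]

print(sorted(titles, key=word_by_word_key))
print(sorted(titles, key=letter_by_letter_key))
```

Under the word-by-word key CELLAR WORKER files first, since CELLAR is a complete word that precedes CELLARMAN; under the letter-by-letter key CELLARMAN files first, since M precedes W. This agrees with the reversal the reviewer reports.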

Changes not immediately apparent have 
also been effected. Coverage has been ex- 
panded within the professional occupations 
as well as within several industrial categories. 
Codes now accompany all job definitions 
previously referred to classification titles. 
Four of the so-called grouping title defini-
tions have been eliminated and type-of-work
classifications introduced for many laboring 
jobs. Such changes, coupled with the double 
alphabetic listings, mean that thousands of 
additional jobs are now readily codable, which 
is a marked contrast to the cumbersome mul- 
tiple reference processes required with the 
first edition. 

Glimpses of other possible improvements 
are found by comparing the Second Edition 
with the former publications for such classifi- 
cations as CHEMICAL ENGINEER, ELEC- 
TRICAL ENGINEER, and CHECKER 
(clerical) III. Redundancy, overlapping, and 
repetitious classifications have been substan- 
tially eliminated with no significant loss of 
occupational information. It is regrettable 
indeed that publication was not delayed until 
a host of similarly needed changes were made 
—also until the remaining grouping title defi- 
nitions, so frustrating to users, were elimi- 
nated. 

A serious complaint against the original 
edition was the amount of training time re- 
quired to achieve proficiency in its use. 
Here, again, the Second Edition is an im- 
provement—experience having already shown 
that training time is about one third less. 

The general format of the Occupational 
Classification in Volume II remains essentially 
the same with the different titles being readily 
identifiable so that users may locate those 
with definitions in Volume I directly. Users 
will be pleased to note the elimination of the 
LABOR, PROCESS classifications. 


This reviewer strongly feels that the kinds
of improvements found in the Second Edition
should have been extended throughout many
additional occupational areas.


Alan M. Kershner
Personnel Research Center, Inc.,
Arlington, Va.


Prasad, Kali. Fatigue and efficiency in tex-
tile industry. Lucknow, India: Univ. Luck-
now Press, 1950. Pp. … + 34. Rs. 1/8
or 2s. 3d.


This is one report in a continuing series of
research studies begun in 1947 in the Swa-
deshi Cotton Mills, India. Four operations
were studied in detail. Output data were
analyzed by hours, days, months, and shifts. 
Psychophysical tests were administered to 
some employees, and a questionnaire was ad- 
ministered to a sample. 

It is difficult to evaluate this book properly 
because of cultural differences in the degree 
of psychological sophistication between India 
and the United States. Further, this book is 
not the final report of the entire series of 
studies. 

From the viewpoint of United States psy- 
chology, this study is weak in several im- 
portant respects. Fatigue is defined as “a 
condition . . . caused by activity in which the 
output produced by that activity tends to be 
rather poor, and the degree of fatigue tends 
to vary directly with the poverty of output.”
It appears to this reviewer that this defini- 
tion also covers “monotony,” for example. 
It might be better to concentrate on varia- 
tions in output, and forget the fatigue. 

The mill had 9,404 employees, but most 
data refer to extremely small samples, i.e., 4, 
8, 33, etc. The “criterial level” or “ideal per- 
formance” of each worker was based on a 
one-half hour sample of his output. These 
samples are not only too small, but also sub- 
ject to disturbing variables such as the “Haw- 
thorne effect.” It is not always clear whether 
data were “experimentally” collected, or taken 
from routine records. The significance of 
much of the data is not clear. 

Since Indian psychology and economy are 
both rather new, it is possible that this is an 
important study in India. The present book 
is not particularly valuable to Americans. 
Perhaps the final report of the whole study 
will be useful. 


Harold F. Rothe 


American Hospital Supply Corporation, 
Chicago, Illinois 


Lauer, A. R. Learning to drive safely. Min- 
neapolis: Burgess Publishing Co., 1942. 
Pp. 145. $2.25. 

This manual conveniently and precisely 
presents a well-conceived driver training 
course for the course administrator, the driv- 
ing instructor, and the student. In the words 
of the author: “This manual is the result of 
twenty years’ study of drivers’ aptitudes, 
habits, abilities, and disabilities, in addition 
to ten years’ experience in teaching drivers 
and instructors of driving at Iowa State Col- 
lege. Every step outlined has been carefully 
tested and evaluated for difficulty, order of 
presentation, and usefulness. . . .” 

Duties and responsibilities of all connected 
with the course from administrator to student 
are specifically prescribed, including such de- 
tails as solicitation and payment of fees and 
principles and procedures to protect students 
and equipment. The course material itself 
consists of ten basic units of instruction, 
which may be covered in ten or more lessons. 
Each unit contains an introduction directed 
to the student, an outline of skills to be mas- 
tered, a few reference readings, a list of ques- 
tions, and a student’s report form. A valu- 
able appendix contains specific suggestions 
for handling classes, suggested administra- 
tive forms, psychophysical and psychological 
tests, a list of equipment needed, and a list 
of films and visual training aids. 

While the manual is directed to a non- 
professional audience, there are two items of 
particular interest to psychologists. First, 
the author considers the development of 
proper attitudes toward driving as a most 
important part of the course. He stresses the 
need for the course to begin with reading 
and classroom discussion and exercises in 
order “to broaden their interest in [illegible] 
and the philosophy of safety education.” 
Secondly, the appendix does contain a 
section on the interpretation and use of psy- 
chophysical and psychological tests. One 
hopes that the [illegible] of the tests in 
the manual will seek advice from [illegible] 
qualified to interpret test results. The re- 
viewer finds a statement encouraging this ac- 
tion conspicuously lacking. 

In summation, [illegible] tremendous aid to 
the school [illegible] planning to establish a 
driver training course or seeking to improve 
an existing [illegible]. Its primary value to 
psychologists, as to all citizens, will not come 
from [illegible] the manual, but will accrue 
from lowering the accident rate as more and 
better formal training courses are established. 
Stanley E. Jacobs 


Department of the Army, 
Washington, D. C. 


Shostrom, Everett, and Brammer, Lawrence. 
The dynamics of the counseling process. 
New York: McGraw-Hill Co., 1952. Pp. 
xvi + 213. $3.50. 

A book designed to meet the present needs 
of counseling should certainly deal with, 
among other things, problems of definition, 
real or apparently conflicting theories and 
practices, foundations in general psychology, 
especially learning and motivation, and the 
role of counseling in its various settings. This 
book was apparently so designed. Unfor- 
tunately, it falls short of the mark at almost 
every point. 

Its core is a description of the “self-adjus- 
tive” approach which represents one more at- 
tempt to synthesize the Minnesota and Chi- 
cago positions. It is Rogerian in its major 
features but tries to make room for testing 
and informational procedures. It is defined 
as “. . . counseling which assists the client 
to become more self-directive and self-re- 
sponsible” (p. 2). While it may be more 
appealing to state a definition in terms of 
goals rather than in terms of operations, it 
is probably less scientifically useful. For- 
tunately, this does not affect the discussion of 
actual methods which is fairly well done and 
describes procedures that have been followed 
for some time in the better counseling centers. 

Concerning the synthesis of opposing points 
of view, the authors furnish their own best 
criticism: “It is this middle-of-the-road stand, 
taken by so many counselors, which has 
created more confusion than clarification in 
counseling methodology” (p. 4). For despite 
their diagrams and denials, they are, if not 
on a continuum (which probably does not 
exist), certainly in between. 

In the process, they tilt at the usual wind- 
mill, directive counseling, and bandy about 
the usual tired, emotionally-toned, invidious 
comparisons in which the good (i.e., Rogerian, 
permissive, or self-adjustive methodology) is 
characterized as “democratic” as in the fol- 
lowing: “It would appear that the basic as- 
sumptions of democracy and those of client- 
centered therapy are one and the same...” 
(p. 10). Their bête noire is Williamson’s 
fifteen-year-old text on counseling. 

The authors’ attention to a systematic 
basis on learning theory for their method is 
commendable, but the reviewer was puzzled 
by their use of John Dewey’s 1933 book as 
the principal (and almost exclusive) core of 
the presentation. The names of Hull, Guth- 
rie, and Tolman do not appear, and regretta- 
bly little is made of the cited works of Dol- 
lard and Miller, Mowrer, and Shoben. 

The Stanford Guidance Study is presented 
as an example of research on counseling. In 
it, “Feeling tone . . . was the criterion used to 
evaluate the effectiveness of counseling” (p. 
41). Their conclusions are based on differ- 
ences between ratings of such feelings; differ- 
ences are described as significant, but no sta- 
tistics are presented to document this. 

Methodologically, they have accepted 
several doubtfully valid notions. For ex- 
ample, they say, “It is assumed that clients 
are capable of selecting their own tests” (p. 
74). Also, they suggest that, “Perhaps if 
counselors would concentrate less on the limi- 
tations of students and more on the limita- 
tions of test data, the quality of guidance 
would improve” (p. 29). Perhaps! But this 
is a rather naive criticism of one of the best 
developed aspects of counseling. And it is 
difficult to see how the authors could have 
failed to see the implications of their state- 
ment that, “The only (italics added) indica- 
tors that anxiety has been reduced are the 
client’s feelings expressed toward himself and 
the counseling services” (p. 151). 

While the foreword makes much of the fact 
that this book conceives of counseling as an 
integral part of education, the actual discus- 
sion of the role of counseling in colleges and 
universities is limited to six pages of quite 
superficial description of needs. The train- 
ing of counselors is dealt with in fourteen 
lines which emphasize the value of electrical 
recordings; these presumably give the trainee 
a knowledge of rather than a knowledge about 
counseling. 


This book left the reviewer with one over- 
riding impression: that it is undigested. The 
authors are obviously enthusiastic and am- 
bitious; they are aware of the major needs of 
counseling; they have written well and made 
their points forcefully. Yet the total prod- 
uct is, regrettably, unsatisfactory. 

John W. Gustad 

University Counseling Center, 

University of Maryland 


Guetzkow, H. (Ed.). Groups, leadership and 
men. Pittsburgh: Carnegie University 
Press, 1951. Pp. ix, 293. $5.00. 


This book presents progress reports of five 
years (1945-1950) of contract research in 
Human Relations sponsored by the Office of 
Naval Research. These twenty reports are 
revisions of papers given at a mutual stock- 
taking conference which was held at Dear- 
born, Michigan in September, 1950. Psy- 
chologists predominate among the contribu- 
tors, who also include sociologists, political 
scientists, economists, and journalists, all of 
whom were members of the research teams 
involved in the undertaking. 

The book is divided into the three main 
sections suggested by the title. About half 
of the space and total number of reports are 
included in the first section, which deals with 
research on the behavior of groups. R. B. 
Cattell introduces this section with the formu- 
lation of methodology and basic concepts. 
Following this discussion are a number of re- 
ports by leading members of several Univer- 
sity of Michigan research centers. Among 
the subjects treated are components of group 
morale, the effects of communication on non- 
conformists, workers’ loyalties to union and 
management, and factors making for group 
productivity. The section is concluded with 
Margaret Mead’s paper on research in con- 
temporary cultures. 

The second section deals with problems of 
leadership. Topics discussed include: the in- 
fluence of the group in determining leader- 
ship style, the relation of the follower’s per- 
sonality to the leader, and leadership effec- 
tiveness at the production level. 

In the final section, which is concerned with 
individual behavior, the psychological reader 
is brought back to terra firma. New light is 
cast on traditional problems of measuring 
motivation, the relationship of verbal be- 
havior to the reasoning process, and the ad- 
vantages of neuropsychiatric screening. 

Some technical detail has been omitted 
from the original reports in order to make 
them suitable for a wider readership. How- 
ever, references at the end of most chapters 
have been supplied to aid the more curious 
social scientists in following up these over- 
views. 

A service is rendered the reader by John G. 
Darley who has contributed introductory and 
concluding chapters designed to give perspec- 
tive and integration to an assortment of com- 
petent but somewhat discontinuous reports 
of on-going research. The reader who is more 
concerned with practical military application 
is accommodated by a discussion at the end 
of the book, and those interested in securing 
contract subsidy for their projects will find 
the appendix helpful. 

In general, the content of the book is more 
of a prologue to a new social psychology than 
a report of substantial achievement. The 
atmosphere is one of more problems raised 
than solved, and the predominant theme is 
“further research needed.” But there is a 
healthy respect for the canons of scientific 
method by the seasoned researchers who have 
contributed to this volume. Although the 
reports deal with path-finding in new terri- 
tories, the projects generally involve prob- 
lems that are reduced to testable hypotheses. 
The generalizations are for the most part 
tentative and limited to the data actually in- 
volved, with a notable absence of intuitive and 
sweeping conclusions. Nevertheless, the re- 
viewer concurs with Darley in the expressed 
need for more synthesis and higher order gen- 
eralization, since the visions gained by this 
exploratory work may tend to be obscured 
by the trees. Another source of uneasiness 
also made explicit by Darley is the insuf- 
ficient consideration of the role of abilities 
and interests as determinants of group be- 
havior. Since psychology has had consid- 
erable success in these areas, even a prologue 
may benefit from the past. 

This volume is a useful source of supple- 
mentary reading for students in social psy- 
chology, and is of particular interest to the 
multitude of social scientists now in the em- 
ploy of various Armed Forces Human Re- 
sources programs. It has a wider appeal than 
most technical publications of in-service mili- 
tary groups, for the emphasis is on basic re- 
search which the Navy has so far-sightedly 
underwritten. Also, it is a good example of 
what may emerge from the large-scale insti- 
tutional research which has become another 
sign of our times. 
Abraham S. Levine 


Bureau of Naval Personnel, 
Washington, D. C. 


Curran, C. A. Counseling in Catholic life 
and education. New York: The Macmillan 
Co., 1952. Pp. 462. $4.50. 


This book is a new approach to counseling 
in a number of ways. It is new in combining 
an accurate knowledge of modern counseling 
techniques as they have developed in the 
fields of psychology and education in America 
with the Thomistic and Aristotelian concepts 
of the virtues. In addition, it definitely re- 
lates religion and counseling together. 

In its technical presentation, this book 
clearly distinguishes counseling from guid- 
ance and so opens the way for the use of both 
types of relationships with persons who come 
for help. Curran defines counseling as “a 
definite relationship where, through the coun- 
selor’s sensitive understanding and skillful 
responses, a person objectively surveys the 
past and present factors which enter into his 
personal confusions and conflicts and, at the 
same time, reorganizes his emotional reac- 
tions so that he not only chooses better ways 
to reach his reasonable goals, but has suf- 
ficient confidence, courage, and moderation to 
act on these choices.” Elsewhere he has 
defined guidance as “a relationship in which 
a person equipped in a particular field sup- 
plies pertinent facts to an immediate personal 
need. Guidance readiness occurs,” he says, 
“when a unique convergence of events in a 
person’s personal life makes a particular kind 
of information far more meaningful at a given 
point in life than it would be at any other 
period.” In this book, the discussion of coun- 
seling presupposes adequate instruction and 
information obtained from teaching or guid- 
ance and treats these questions only to ex- 
plain more clearly some aspect of counseling 
or reasons for a particular kind of counselor 
method. 

The book is made up of five parts. The 
first part includes some important recent de- 
velopments in counseling. The second sec- 
tion delineates the process of personal in- 
tegration as it occurs in counseling from the 
point of view of the person who comes for 
counseling. This is of special interest for 
beginning counselors who may not have an 
experiential grasp of a counseling series as 
the person goes through it. The experienced 
counselor, too, may find, as did the present 
reviewer, numerous considerations not treated 
in other books on counseling. Part III pre- 
sents the counselor’s side by unfolding the 
skill of the counselor as it varies throughout 
the different phases of counseling. This sec- 
tion gives a detailed exposition of the coun- 
selor’s skill in each of the five stages of coun- 
seling described: establishing the relationship, 
initiating counseling dynamics, later phases, 
and the final stages of counseling. A chapter 
on skills with children is also included. Most 
of this part is given over to the different 
methods of the counselor’s responses so that 
deepest content of a person’s statements may 
be objectively unfolded and reflected. The 
detailed excerpts from actual interviews are 
exceptionally suitable for a careful study of 
the counselor’s skill. Part IV, the approach 
to counseling, is directed to increasing coun- 
selor sensitivity to counseling atmosphere, to 
disguised expressions of counseling need, and 
to the ways in which informational and 
guidance roles may facilitate counseling. It 
includes also a chapter on group discussion 
and group counseling. The concluding chap- 
ter is an integration of counseling with re- 
ligion. This is especially valuable since both 
counseling and religion aim at aiding a per- 
son to be more at peace with God and him- 
self, happier, and more able to lead an in- 
dependent, responsible, achieving life. 

While this book has a definitely Catholic 
application as its title indicates, yet the title 
could be misunderstood and therefore mis- 
leading. The content of the book would 
readily be shared, in this reviewer’s opinion, 
by chaplains of any denomination as well as 
by psychologists, psychiatrists, and educators 
and, in fact, by any persons who have active 
religious beliefs and convictions and wish to 
see how such a religious point of view can 
be integrated with modern methods of coun- 
seling and guidance. A special merit of the 
book is that it achieves this integration with- 
out losing any of the rigor and exactness of 
a careful scientific study. 


Robert J. Sherry 
Hq. Army Field Forces, 


Fort Monroe, Virginia 




New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, 
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota 


Understanding that boy of yours. Melbourne S. Ap- 
plegate. Washington, D. C.: Public Affairs Press, 
1953. Pp. 52. 

Rudolf Pintner, in memoriam. Seth Arsenian, Ed. 
Washington, D. C.: Gallaudet College Press, 1953. 
Pp. 63. 

Innovation, the basis of cultural change. H. G. 
Barnett. New York: McGraw-Hill Book Co., 
Inc., 1953. Pp. 462. $6.50. 

Getting along with people. Eugene J. Benge. New 
London: Bureau of Business Practice, National 
Foremen’s Institute, Inc., 1952. Pp. 29. $.25. 

Practical psychology. Karl S. Bernhardt. Second 
edition. New York: McGraw-Hill Book Co., Inc., 
1953. Pp. 337. $3.75. 

Psychoanalytic theories of personality. Gerald S. 
Blum. New York: McGraw-Hill Book Co., Inc., 
1953. Pp. 219. $3.75. 

Social factors related to job satisfaction. Research 
Monograph No. 70. Robert P. Bullock. Colum- 
bus: Bureau of Business Research, Ohio State Uni- 
versity, 1952. Pp. 105. $2.00. 

Human relations I. Cases in concrete social science. 
Hugh Cabot and Joseph A. Kahl. Cambridge: 
Harvard University Press, 1953. Pp. 273. $4.25. 

Human relations II. Concepts in concrete social sci- 
ence. Hugh Cabot and Joseph A. Kahl. Cam- 
bridge: Harvard University Press, 1953. Pp. 333. 
$4.75. 

Phantasy in childhood. Audrey Davidson and Judith 
Fay. New York: Philosophical Library, 1953. 
Pp. 188. $4.75. 

Marriage, morals and sex in America. Sidney Dit- 
zion. New York: Bookman Associates, 1953. Pp. 
440. $4.50. 

Statistics in psychology and education. Henry E. 
Garrett. New York: Longmans, Green and Co., 
Inc., 1953. Pp. 460. $5.00. 

The human senses. Frank A. Geldard. New York: 
John Wiley and Sons, Inc., 1953. Pp. 365. $5.00. 

The intimate life. J. Norval Geldenhuys. New 
York: Philosophical Library, 1952. Pp. 96. $2.75. 

Solitude and privacy. Paul Halmos. New York: 
Philosophical Library, 1953. Pp. 181. $4.75. 

Writing clinical reports. Kenneth R. Hammond and 
Jeremiah M. Allen, Jr. New York: Prentice-Hall, 
Inc., 1953. Pp. 288. 

Selected case problems in industrial management. 
Paul E. Holden and Frank K. Shallenberger. New 
York: Prentice-Hall, Inc., 1953. Pp. 318. $3.75. 

The changing culture of a factory. Elliott Jaques. 
New York: Dryden Press, 1952. Pp. [illegible]. 

The Vienna circle. Victor Kraft. New York: Philo- 
sophical Library, 1953. Pp. 209. $3.75. 

A history of psychology in autobiography, Vol. IV. 
Herbert S. Langfeld and Edward G. Boring, Edi- 
tors. Worcester: Clark University Press, 1953. 
Pp. 372. $7.50. 

Human factors in air transportation. Ross A. Mc- 
Farland. New York: McGraw-Hill Book Co., 
Inc., 1953. Pp. 830. $13.00. 

Children in play therapy. Clark E. Moustakas. 
New York: McGraw-Hill Book Co., Inc., 1953. 
Pp. 213. $3.50. 

Freudian psycho-antics, fact and fraud in psycho- 
analysis. Maurice Natenberg. Chicago: Regent 
House, publishers, 1953. Pp. 101. $2.00. 

How to improve classroom testing. C. W. Odell. 
Dubuque: Wm. C. Brown Co., 1953. Pp. 156. 
$3.00. 

The Wechsler-Bellevue scales, a guide for counselors. 
C. H. Patterson. Springfield, Ill.: Charles C 
Thomas, publisher, 1953. Pp. 146. $3.75. 

A manual for administrative analysis. John M. 
Pfiffner and S. Owen Lane. Dubuque: Wm. C. 
Brown Co., 1953. Pp. 88. $2.50. 

The best years of your life. Marie Beynon Ray. 
Boston: Little, Brown and Co., 1952. Pp. 300. 
$3.95. 

Administering the elementary school. Reavis, Pierce, 
Stullken and Smith. New York: Prentice-Hall, 
Inc., 1953. Pp. 384. 

Cases of public personnel administration. Henry 
Reining. Dubuque: Wm. C. Brown Co., 1953. 
Pp. 142. $3.00. 

The Soviet impact on society. Dagobert D. Runes. 
New York: Philosophical Library, 1953. Pp. 202. 
$3.75. 

Science and human behavior. B. F. Skinner. New 
York: The Macmillan Co., 1953. Pp. 461. $4.00. 

The stepchild. William Carlson Smith. Chicago: 
University of Chicago Press, 1953. Pp. 314. $6.00. 

New York television. Dallas W. Smythe. Urbana: 
National Association of Educational Broadcasters, 
Gregory Hall, 1952. Pp. 108. 

Readings in learning. Lawrence M. Stolurow, editor. 
New York: Prentice-Hall, Inc., 1953. Pp. 512. 

Your child and his problems. Joseph D. Teicher. 
Boston: Little, Brown and Co., 1953. Pp. 302. 
$3.75. 

Student deferment in selective service. M. H. Tryt- 
ten. Minneapolis: University of Minnesota Press, 
1953. $3.00. 

Motivation and morale in industry. Morris S. 
Viteles. New York: W. W. Norton and Co., Inc., 
1953. $7.50. 

Research in the international organization field. 
Some notes on a possible focus. Richard W. Van 
Wagenen. Princeton: Center for Research on 
World Political Institutions, Princeton University, 
1952. Pp. 78. 


New means of studying color blindness and normal 
foveal color vision. Gordon L. Walls and Ravenna 
W. Mathews. Berkeley: University of California 
Press, 1952. Pp. 172. $2.50. 

Philosophy and psycho-analysis. John Wisdom. New 
York: Philosophical Library, 1953. Pp. 282. $5.75. 

Success in psychotherapy. Werner Wolff and Joseph 
A. Precker. New York: Grune and Stratton, 1952. 
Pp. 196. $4.75. 

Research into the causes of feeblemindedness. A 
symposium. Utica: State Hospitals Press, 1952. 
Pp. 37. 

Industrial development at home and abroad—prob- 
lems and prospects. Financial Management Series 
No. 101. New York: American Management As- 
sociation, 1952. Pp. 28. $1.25. 

Interracial practices in the YMCA. National Study 
Commission on Interracial Practices in the YMCA. 
New York: Association Press, 1953. Pp. 48. 
$1.00. 

Personnel administration and the development of the 
personnel staff. Personnel Planning Project No. 
20-6-51. Director, Personnel Planning, DCS/P 
Headquarters, Air Training Command, 1952. Pp. 
106. Gratis, limited number of copies available. 

Preparing employees for retirement. Personnel Se- 
ries No. 142. New York: American Management 
Association, 1951. Pp. 27. $1.25. 

Operating problems of personnel administration. 
Personnel Series No. 144. New York: American 
Management Association, 1952. Pp. 40. $1.25. 

Practical approaches to supervisory and executive de- 
velopment. Personnel Series No. 145. New York: 
American Management Association, 1952. Pp. 42. 
$1.25. 

Spotlighting the labor-management scene. Personnel 
Series No. 147. New York: American Manage- 
ment Association, 1952. Pp. 43. $1.25. 

Theses in the social sciences. UNESCO. New York: 
Columbia University Press, 1952. Pp. 236. $1.25. 


Comparative survey on juvenile delinquency. United 
Nations. New York: Columbia University Press, 
1952. Pp. 132. $1.00. 

Traffic in women and children. United Nations. 
New York: Columbia University Press, 1952. Pp. 
43. $.40. 


Journal of Applied Psychology 


Vol. 37, No. 4 


AUGUST, 1953 


Socio-Psychological Factors in Industrial Morale: II 


Raymond E. Bernberg 


Los Angeles State College 


Previously reported research (1) compared 
the predictive ability of different tests of mo- 
rale for performance indicators in the work 
situation. The tests of morale were based 
upon current concepts; viz.: group morale 
(GM); employee attitude toward the com- 
pany (CM) (i.e., acceptance of the formal 
organization by its members); rating of the 
supervisors by the workers (S); and self- 
rating of morale by the workers (SM). Per- 
formance indicators used were absences, tar- 
diness, short time absences, medical-aid unit 
visits, and merit rating. The results indicated 
a general failing of all tests of morale to have 
much predictive value for any of the perform- 
ance indicators.¹ 

An interesting result showed up when SM 
(a thermometer scale requesting the worker 
to check along a 0 to 100-degree parameter 
with verbal referents the degree to which he 
agreed with the proposition that his work 
group had high morale exemplified by their 
working together as a well organized team) 
was taken as a criterion and a determination 
made of the predictive value the other three 
tests of morale had for it. 

Table 1 contains the Pearson product mo- 
ment intercorrelations of the four measures 
based upon 890 cases. The results of the 
multiple regression analysis are in Table 2. 
It appears that GM with a correlation of plus 
.67 with SM has contributed almost the en- 
tirety of the multiple R of plus .69. 

This result gives much validity to the con- 
cept of morale as a group phenomenon as 
measured by the group morale test. This, of 
course, is based upon the assumption that the 
collective opinions of the workers themselves 
are an adequate criterion for appraising mo- 
rale in the work situation. 

Table 1 
Intercorrelations of the Tests of Morale 

        GM            CM            S             SM
GM      —             .77           [illegible]   .67
CM      .77           —             [illegible]   [illegible]
S       [illegible]   [illegible]   —             .49
SM      .67           [illegible]   .49           —

Table 2 
Beta Weights and Multiple R with SM as Criterion 

        1 (GM)    2 (CM)    3 (S)
β       .71       -.16      .16

R = .69; σR = .02. 

¹ This conclusion is in conflict with the multiple 
correlation of .71 between six objective measures of 
efficiency and morale scores. See Giese, W. J. and 
Ruter, H. W. An objective analysis of morale. J. 
appl. Psychol., 1949, 33, 421-428. —Editor. 
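The statement that GM supplies almost the entirety of the multiple R can be checked with simple arithmetic. For standardized predictors, the squared multiple correlation equals the sum, over predictors, of each beta weight times that predictor's correlation with the criterion. The short Python sketch below (Python is used here only for illustration; the variable names are not from the article) applies that identity to the three values the text reports legibly: the GM beta weight, the GM-SM correlation, and the multiple R.

```python
# Values reported in the text and in Table 2 of the article.
beta_gm = 0.71      # beta weight of GM
r_gm_sm = 0.67      # zero-order correlation of GM with SM
multiple_r = 0.69   # reported multiple R

# For standardized predictors: R^2 = sum(beta_i * r_i).
# Here only the GM term of that sum is computed.
gm_term = beta_gm * r_gm_sm
r_squared = multiple_r ** 2
gm_share = gm_term / r_squared

print(f"GM term  = {gm_term:.4f}")
print(f"R^2      = {r_squared:.4f}")
print(f"GM share = {gm_share:.3f}")
```

On these figures the GM term by itself is nearly equal to the squared multiple R, which suggests that the CM and S terms, carrying beta weights of opposite sign, add very little between them.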

The group morale test (2) is a projective 
type paper and pencil test using the direc- 
tion of perception technique of attitude meas- 
urement. It is based upon content derived 
from six determiners discussed by Krech and 
Crutchfield (4). They are: 1) positive goals; 
2) satisfaction of accessory needs; 3) sense of 
advance toward goals; 4) level of aspiration 
and achievement; 5) time perspective; and 
6) feelings of identification, solidarity, and 
involvement. There are 34 items in the test, 
all equally weighted. 

The question which presented itself next 
was: If this test with its total content has 


250 


such a high relationship with the collective 
opinion of workers concerning their cognition 
of morale in their work group, which items 
with their specific content comprise the maxi- 
mal possible prediction for the test as a 
whole? This is an analytical question which 
in essence asks: What do the workers mean 
by morale? This of course concerns only the 
relationship of the content of the items of the 
test and the criterion. 

Gengerelli (3) has developed a technique 
of analysis whereby one may reduce a large 
battery of tests in their relation to a criterion, 
to a number of sub-tests which provide the 
maximal prediction of the criterion. This 
method describes the whole battery in terms 
of the smaller sub-set and provides a regres- 
sion equation to express the total correlation 
matrix. He states that this method yields the 
sub-tests as empirical tests which might be 
considered as “factors” of a sort. This pro- 
vides an immediate practical solution to an- 
swer the analytical question posed above. 

The SM measure was taken as the criterion 
and the 34 items of the GM test as the bat- 
tery of tests for the analysis. Because of the 
large number of cases, 100 of the 890 were 
selected randomly for determining the inter- 
correlations between the items (tetrachoric r) 
and the correlations between the items and 
the criterion (bi-serial r). 

The results of the analysis produced 4 fac- 
tors which as a sub-set have a multiple R of 
plus .96 with the criterion. Table 3 shows 
the intercorrelations between these items and 
with the criterion. Table 4 indicates the 
beta weights. 

Table 3 
Intercorrelations of the Four “Factors” (Items) and SM 

Item    SM      3       9       11      18
SM      —       .17     .14     .32     .82
3       .17     —       .16     .00     -.14
9       .14     .16     —       .12     -.34
11      .32     .00     .12     —       .17
18      .82     -.14    -.34    .17     —

Table 4 
Beta Weights and Multiple R with SM as Criterion 

        1           2             3            4
        (Item 3)    (Item 9)      (Item 11)    (Item 18)
β       .218        [illegible]   .130         .820

R = .96; σR = .09. 
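As a rough cross-check on Tables 3 and 4, the same correlations can be put through an ordinary least-squares solution. The Python sketch below solves the normal equations for standardized beta weights and the multiple R from the Table 3 values. It is not Gengerelli's procedure, whose details are not given in this article; it is the standard textbook solution, and the helper `solve` is a hypothetical name introduced only for this illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Intercorrelations of items 3, 9, 11 and 18 (Table 3).
item_r = [
    [1.00, 0.16, 0.00, -0.14],
    [0.16, 1.00, 0.12, -0.34],
    [0.00, 0.12, 1.00, 0.17],
    [-0.14, -0.34, 0.17, 1.00],
]
# Correlations of the same items with the criterion SM (Table 3).
criterion_r = [0.17, 0.14, 0.32, 0.82]

betas = solve(item_r, criterion_r)
r_squared = sum(b * r for b, r in zip(betas, criterion_r))
multiple_r = r_squared ** 0.5

for item, beta in zip((3, 9, 11, 18), betas):
    print(f"Item {item:>2}: beta = {beta:.3f}")
print(f"Multiple R = {multiple_r:.3f}")
```

The least-squares multiple R comes out close to the reported value, while the individual weights differ somewhat from those in Table 4. That would be consistent with the reported weights coming from a different, possibly approximate, procedure, though the article does not say so.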


These four “factors” are the following items 
of the test: 

Item 3. “Scientific studies show 
that in groups such as yours, if you wanted 
to hold a big party, picnic or other type of 
friendly gathering, you would not care to in- 
vite: (a) 60% of your work group; (b) 35% 
of your work group.” 

Item 9. “Statistics show 
continually that the increase in group produc- 
tion is a result of: (a) group effort toward 
step-up; (b) individual effort.” 

Item 11. “Re- 
cent industrial studies have shown that the 
following percentage of workers in a group 
such as yours gives a good amount of atten- 
tion to ways of getting ahead: (a) 60%; (b) 
90%.” 

Item 18. “A recent poll of workers in 
groups such as yours found that workers got 
a lot of satisfaction from working together: 
(a) infrequently; (b) frequently.” 

The possible importance of these findings 
need not lie in the measurement of morale 
per se. It appears that it would be wise in 
developing and controlling work groups to 
consider the following factors: (1) satisfac- 
tion of men from working together; (2) in- 
crease in production as a result of group ef- 
fort; (3) intimacy of workers with each other 
beyond as well as in the work surroundings; 
and (4) the individual level of aspiration in 
getting ahead. 


Received October 3, 1952. 


References 

1. Bernberg, R. E. Socio-psychological factors in in- 
   dustrial morale: I. The prediction of specific 
   indicators. J. soc. Psychol., 1952, 36, [pages illegible]. 
2. Bernberg, R. E. The direction of perception tech- 
   nique of attitude measurement. Int. J. Opin. 
   Attitude Res., 1951, 5, 397-406. 
3. Gengerelli, J. A. A method of analysis in which 
   the factors are empirical tests. Psy[illegible], 
   1952, 33, 150-174. 
4. Krech, D., and Crutchfield, R. S. Theory and 
   problems of social psychology. New York: 
   McGraw-Hill, 1948. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 4, 1953 


Predicting Success in Dental School 


Wilbur L. Layton 


Student Counseling Bureau, University of Minnesota 


During the past several years, a trend in 
the field of testing has been indicated by 
the establishment of nation-wide testing pro- 
grams in which applicants for admission to 
certain schools, mainly professional schools, 
are tested in centralized programs. All par- 
ticipating colleges or schools then use the re- 
sulting test scores in the process of selecting 
and admitting students. 

However, a test battery which is valid as 
a selection device at one institution may not 
be valid at another institution. Rarely do 
the administrators of the national testing 
programs publish validity data pertaining to 
specific institutions. Hence most institutions 
participating in a national program are doing 
so blindly unless they evaluate the test bat- 
tery in their own situation. 

Since the fall of 1946, the School of Den- 
tistry at the University of Minnesota has 
participated in the testing program sponsored 
by the Council on Dental Education of the 
American Dental Association by testing fresh- 
man classes at the beginning of the school 
year. During the first five years of this test- 
ing program, an attempt was made to deter- 
mine the relationship between various tests 
administered in the program and success in 
Dental School. In 1951 the Council on Den- 
tal Education decided that the test battery 
currently used was adequate and should be
used by dental schools to select students for
their freshman classes. However, little evidence
was available locally to indicate the
usefulness of the battery for selection purposes.

Consequently, a study was designed that
would correlate pre-dental grades and the test
data available through the national program
with grades earned by dental school freshmen
who entered the University of Minnesota
Dental School in the years 1946 through
1949. First year grades in the dental school
were the primary criteria for each of the four
classes. For the freshman class entering in
1946 four year grades were also available as
criteria. Separate grades in freshman courses
in physiological chemistry, anatomy and
prosthetics were also used as criteria for all
four classes studied. These data were punched
into Hollerith cards to facilitate analysis.

Table 1 presents the means and standard 
deviations for the data which were available 
for the classes entering in the various years 
1946 through 1949. Table 1 indicates that 
the tests included in the battery varied from 
year to year. The battery was stabilized in 
1949. This battery is the one currently 
(1952) being used in the national program. 

The ACE is the American Council on Edu- 
cation psychological examination. It is one 
of the most widely used scholastic aptitude 
tests. It provides three scores: a linguistic
or L score, a quantitative or Q score, and a 
total score. Only the total score was used in 
this investigation. 

The Survey of Object Visualization is simi- 
lar in content to the Revised Minnesota Pa- 
per Form Board. The Survey of Natural Sci- 
ences is a 90-item test measuring facts and 
applications of principles in biology, chem- 
istry and physics. The total raw score was 
used in the analyses. 

The Carving Dexterity test is a chalk carv- 
ing test developed by the Council on Dental 
Education. The examinee uses a knife, ruler 
and pencil to carve a piece of chalk which 
measures approximately 314” in length and 
54” in diameter to correspond to an illus- 
trated figure. The carvings are graded by 
judges on the basis of objectives such as: 
“flatness of surfaces,” “clean-cutness of an- 
gles,” “symmetry” and “accuracy of repro- 
duction.” The test supposedly measures 
finger-knife dexterity as well as spatial visu- 
alization. 

GED No. 3 is the USAFI test of General 
Educational Development test three, “Inter- 
pretation of Reading Materials in the Natu- 
ral Sciences.” It measures speed of reading 
and comprehension of passages in the natural 
sciences. In 1947, 1948 and 1949 only the 


Table 1

Means and Standard Deviations of Criteria and Predictive Variables for Groups Entering the
Dental School in 1946, 1947, 1948 and 1949

                        1946          1947          1948          1949
                       N = 81        N = 80        N = 84        N = 88
                       M    SD       M    SD       M    SD       M    SD
ACE 131.9 20.9 136.3 15.9 127.8 18.5 134.1 na 
Carving 13.3 3.1 11.4 2.6 9.9 2.3 ne Py 
Survey of Obj. Vis. 29.2 7.4 28.9 6.4 32.1 6.0 30.9 60 
Survey of Sciences * 55.4 8.0 55.4 6.9 55.3 a 
GED No. 3 55.5 12.1 34.0 53 34.2 5.2 34.7 iig 
GED No. 1A 16.6 4.0 16.0 4.0 15.6 3.6 i 
GED No. 1B 82.2 11.4 83.2 11.1 83.1 9.4 h 
Mich. Vocabulary 
Phys. Science 20.1 3.8 * * a 
Biol. Science 18.4 4.9 j * i 
Peterson Word Dex. 35.8 8.6 * * X ; 
Pre-dent. HPR** 14 4 1.8 4 1.8 A 1.7 F 
Frosh HPR** 1.5 4 1.7 4 7 A 1.4 m 
Four year HPR** 1.6 3 7 
Anatomy HPR** 1.6 5 1.5 5 1.5 5 1.4 3 
Phys. Chem. HPR** 1.5 6 1.5 8 1.8 7 1.3 k 
Prosthetics HPR** 1.9 4 1.9 3 1.7 4 1.8 


* Not given.
** HPR = honor points / number of credits (A = 3; B = 2; C = 1; D, F = 0 honor points).


first 45 out of the total 90 items were given.

GED No. 1 is the USAFI test of General
Educational Development test one, “Correctness
and Effectiveness of Expression.” It consists
of a section on spelling and a section in


Table 2

Correlation Coefficients of Various Predictive Indices with Freshman
and Cumulative (4 year) Honor Point Ratios in the Dental School
for the 81 Students in the Class Entering in 1946

                                   Freshman    Four Year
Predictor                          Grades      Grades
Freshman HPR                         -           .83
ACE Total                           .15          .09
GED Reading                         .23          .10
GED No. 1A                          .40          .27
GED No. 1B                          .40          .24
Word Dexterity                      .29          .25
Mich. Vocab. (Phys. Science)        .29          .12
Mich. Vocab. (Biol. Science)        .37          .19
Survey Obj. Vis.                    .14          .49
Carving                             .22          .31
Pre-dental HPR                      .40          Fi


which corrections are to be made in punctuation,
words and phrases in a connected text.
In Table 1, A is the score on the spelling section
of the test; B is the score on the total
test.

The Michigan Vocabulary is the Michigan
Vocabulary Profile test. The two parts
“physical science vocabulary” and “biological
science vocabulary” were included in the
test battery in 1946.

The Peterson Word Dexterity test measures
specifically the extent to which the student
knows the meaning of certain suffixes
and prefixes and measures “dexterity” at
manipulating word parts and word meanings.

Coefficients of correlation between the predictive
indices and total first year grades and
grades in the three first year courses in the
Dental School were computed for all four
groups. Four year grades for the group entering
in 1946 were correlated with the predictive
indices also.

A multiple correlation and a regression
equation were computed for the 1949 group.




Table 3 


Correlation Coefficients of the Various Predictive Indices with Freshman Honor Point Ratio in the 
Dental School for Classes Entering in 1946 Through 1949 


iterion® Survey Pre- 
asi ACE sone Carv- of : GED dent. No. of 
HPR i i Sciences} No. 3* HPR Cases 
= .23 40 81 
43 at 27 80 
17 —.01 40 84 
43 34 42 88 


This group was selected because it was given 
the tests currently being given in the battery. 
These tests presumably are the most valid on 
a nationwide basis.

Table 2 presents the coefficients of correlation
between the predictive indices and freshman
and four year grades for the class entering
in 1946. With an N of 81 the standard
error of r if the true r is zero is .11.
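
The .11 figure can be checked with a short calculation. The sketch below assumes the usual large-sample approximation for the standard error of r when the true correlation is zero, 1 / sqrt(N - 1); the text does not state which formula was used, and the function name is only illustrative.

```python
import math


def se_r_null(n: int) -> float:
    """Approximate standard error of r when the true correlation is zero.

    Assumes the large-sample form 1 / sqrt(N - 1); the article does not
    say which formula was actually used.
    """
    return 1.0 / math.sqrt(n - 1)


se_1946 = se_r_null(81)  # class entering in 1946, N = 81
print(round(se_1946, 2))
```

Under this approximation, a coefficient in Table 2 needs to be roughly two to two and a half times this value before it can be distinguished from zero with much confidence.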

Freshman grades correlated rather highly 
(r = .83) with total four year grades. It is 
interesting to note that the correlations of the 
predictive variables with four year grades are 
smaller than those with freshman grades 
alone. However, the Survey of Object Visu- 
alization and Carving Dexterity tests are more 
highly related to four year grades. This may 
be due to the small number of courses requir- 
ing these skills that are given during the first 
two years of the curriculum. The last two 
years of the curriculum are heavily loaded 
with practicum and clinic courses which require
these skills of the students.

Table 3 presents the correlation coefficients
for various predictive indices for each of four


classes with freshman honor point ratio in the 
dental school. Tests given in Table 3 are the 
tests which are currently used in the National 
Testing Program plus the pre-dental honor 
point ratio. 

In three of the four years (1947 being the 
exception) the pre-dental honor point ratio 
was the best predictor of freshman honor point 
ratios in the dental school. The group enter- 
ing in 1947 was heavily loaded with veteran 
students who had taken their pre-dental work 
before the war. The difference in motivation 
for these students in their pre-war school work 
and their post-war school work may account 
for the relatively lower correlation in the 1947 
group. The findings of Hansen and Paterson 
(1) that there is a “striking increase in post- 
war scholastic achievement as compared with 
pre-war scholastic achievement of the same 
students” (all veterans) tend to support this 
interpretation. It is interesting to note that 
the ACE score did not correlate highly with 
freshman honor point ratio in the first three 
classes studied, but did for the group enter- 
ing in 1949. The Survey of Sciences test cor- 


Table 4 


Intercorrelations of the Variables for the 88 Students in the Class Entering the Dental School in 1949


ACE Survey of Carving Survey of GED Pre-dental 
Total Object Vis. Dexterity Sciences No. 3 HPR 
: sa 29 18 43 34 42 
Fresh. HPR 20 AT 37 43 Al 
ACE Total ó A3 29. 21 .02 
Survey of Object Vis. AS a7 At 
Carving Dexterity 46 .28 
Survey of Sciences 24 


GED No. 3 




Table 5 


Correlations of Honor Point Ratios in Oral Anatomy 50-51-52, Prosthetics 50-51-52, Physiology 58-59 
with Five Predictive Measures for Four Classes of Dental School Freshmen 


1946† 1947 1948 1949
N=81 N = 86 N=90 N= 85 
a. Oral anatomy 50-51-52 HPR with: 
1. Pre-dental HPR 37 26 .26 Al 
2. ACE Total 12 .08 10 37 
3. Survey of Obj. Vis. 35 32 37 34 
4. Carving 42 30 31 24 
5. Survey of Sciences — .18 AZ 38 
6. GED No. 3 .08 al —.04 27 
b. Prosthetics 50-51-52 HPR with: 
1. Pre-dental HPR .28 ll .05 26 
2. ACE Total 00 .20 .09 43 
3. Survey of Obj. Vis. 21 29 13 16 
4. Carving 29 ll 01 01 
5. Survey of Sciences — .49 14 St 
6. GED No. 3 —.15 .09 —.14 04 
c. Physiology 58-59 HPR with: 
1. Pre-dental HPR 22 25 AL 34 
2. ACE Total .05 04 —.06 ld 
3. Survey of Obj. Vis. 07 ld 35 19 
4. Carving 18 22 36 33 
5. Survey of Sciences — .05 14 10 
6. GED No. 3 


14 50 03 39 


† Survey of Sciences Test was not given to the Freshman Class in 1946.


relates higher with first year honor point ratio 
for the 1947 class and for the 1949 class than 
for the 1948 class. 

The fluctuation of the coefficients of cor- 
relation from year to year may be due to the 
differences from year to year in the means 
and standard deviations for the variables and 
hence differences in the composition of the 
classes. Changes in the curriculum could also
account for the variability of the correlation
coefficients. Also, because of the small number
of cases studied in the various classes, this
variation may be only random variation.
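
How much random variation to expect with classes of this size can be illustrated with an approximate confidence interval for a single coefficient. The sketch below uses the Fisher z transformation, which is a standard technique but is not something the article itself reports; the function name and the 95 per cent level are assumptions made for illustration. The example uses the pre-dental HPR coefficient of .40 for the 1948 class (N = 84) from Table 3.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate confidence interval for a correlation via Fisher's z.

    z_crit = 1.96 gives a roughly 95 per cent interval (an illustrative
    choice, not taken from the article).
    """
    z = math.atanh(r)                     # transform r to z
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


low, high = fisher_ci(0.40, 84)
print(round(low, 2), round(high, 2))
```

The interval runs from about .20 to about .57, so the coefficient of .27 obtained for the 1947 class lies inside it. On this reckoning alone, the year-to-year differences in Table 3 need not reflect real changes in the classes or the curriculum.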

The class entering in 1949 was used for the 
computation of intercorrelations of the sey- 
eral variables, a coefficient of multiple correla- 
tion and a regression equation. 

Table 4 presents the intercorrelation of the 
variables studied. 

The regression equation and coefficient of 
multiple correlation were computed by the 
Doolittle method. Of the six predictor vari- 
ables only two, Survey of Sciences and pre- 
dental HPR, yielded significant Beta coeffi- 


cients. The coefficient of multiple correlation 
obtained was .54.
The regression equation is:


$$\hat{Y} = .02X_1 + .39X_2 - .29$$

$X_1$ = Survey of Sciences total raw score;
$X_2$ = Pre-dental HPR;
$\hat{Y}$ = Predicted freshman year honor point ratio.
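
As a rough illustration of how the equation would be applied, the sketch below computes a predicted freshman honor point ratio and also checks the multiple correlation using only the two retained predictors. The function names are hypothetical. The coefficients .02, .39 and -.29 and the validity coefficients .43 and .42 come from the text and Table 3; the intercorrelation of .24 between Survey of Sciences and pre-dental HPR is an assumed reading of the damaged Table 4 and should be treated with caution.

```python
import math


def predict_freshman_hpr(sciences_raw: float, predental_hpr: float) -> float:
    """Apply the reported regression equation for the 1949 class."""
    return 0.02 * sciences_raw + 0.39 * predental_hpr - 0.29


def multiple_r_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple R for two predictors from their three correlations."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)


# Hypothetical applicant near the 1949 class averages.
example_hpr = predict_freshman_hpr(sciences_raw=55, predental_hpr=1.7)

# r_12 = .24 is an assumption (a reading of the garbled Table 4).
r_two = multiple_r_two_predictors(r_y1=0.43, r_y2=0.42, r_12=0.24)

print(round(example_hpr, 2), round(r_two, 2))
```

If the assumed intercorrelation is correct, the two predictors alone give a multiple R of about .54, which agrees with the reported value and with the statement that the other four variables carried no significant weight.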


Table 5 presents the coefficients of correlation
between five predictive indices and honor
point ratios in oral anatomy, prosthetics and
physiological chemistry for the years 1946
through 1949.

The coefficients of correlation presented in 
Table 5 also fluctuate by variable and by 
year. This variation may be due to random 
sampling error, actual changes in the makeup
of the classes or changes in the curriculum. 


Discussion 


It would appear that the five tests currently
retained in the national testing program of
the Council on Dental Education of the American
Dental Association are not highly related
to grades earned by students in the University
of Minnesota Dental School. The ACE test
and the carving test are not predictive of first
year honor point ratios in the course areas
where they might be expected to have some
relationship. Weiss (3) has reported results
very similar to those presented here. He
found that pre-dental grades and part of the
Survey of Sciences test gave moderately high
correlations with theory and technic grades.
Peterson (2) has reported some validity coefficients
for this test battery. The correlations
which he reports are consistently higher
than the ones obtained in the present study.
Hence, the present study and that of Weiss
illustrate the need for local validation of
tests used in national testing programs. Tests
which appeared fairly good on a nationwide
basis did not show up well in the two dental
schools studied.


These studies also show the variability in 
coefficients of correlation one can obtain when 
several groups are studied. This means that 
findings based on one group or a nationwide 
study should be applied with caution in work- 
ing with another group for counseling or ad- 
mission purposes. The use of test data should 
be tempered by careful consideration of all 
other available data. 


Received August 25, 1952. 


References 


1. Hansen, L. M. and Paterson, D. G. Scholastic 
achievement of veterans. Sch. & Soc., 1949, 
69, 195-197. 

2. Peterson, S. Forecasting the success of freshman 
dental students through the aptitude testing 
program. J. Amer. Dent. Assn, 1948, 37, 
259-265. 

3. Weiss, I. Predicting academic success in dental 
school. J. appl. Psychol., 1952, 36, 11-14. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 4, 1953 


Prediction and Practice Tests at the College Level ¹


Scarvia B. Anderson 


George Peabody College for Teachers ²


Attempts to predict college “success” have 
been so numerous in recent years that it is a 
rare freshman indeed who has not been sub- 
jected to a testing program of one sort or an- 
other. The battery of tests used in the pres- 
ent study may be distinguished from hundreds 
of similar batteries chiefly in that it includes 
two very easy practice tests which do not 
seem prima facie to belong with the other 
more conventional placement tests. The ques- 
tions which we shall attempt to answer here 
are “What value do these practice tests have 
in predicting freshman grade point ratio?” 
and “How does this value compare with the 
predictive values of the other tests in the 
battery?” 

The original decision to use practice tests 
in the freshman testing program at George 
Peabody College for Teachers was based upon 
the belief that within a group of entering 
freshmen there would be wide variability in 
“test-wiseness” and ensuing diminished reliability
of test scores. We reasoned, on the
basis of observations in previous years, that 
some of the freshmen would have had no ex- 
perience with objective tests and that many 
of the students would not be familiar with 
the use of special pencils and separate
machine-scored answer sheets.³

It was felt that a preliminary testing situa- 
tion would serve two useful purposes: it would 
aid the students to become more skillful tech- 
nically, and thus to concentrate their later ef- 
forts on the subject matter of the placement 
tests rather than on the mechanics; and it 
would perhaps relieve some of their tensions 
and anxieties. 


1 Dr. Julian C. Stanley, George Peabody College
for Teachers, offered helpful suggestions and criticisms
during the preparation of this article.

2 The author is now associated with the Naval Research
Laboratory.

3 It was determined later from the freshmen who
were tested that only one-half had previously used
separate answer sheets and special pencils, and that
the majority of these had used them only once in a
State testing program.


The tests selected for practice tests were 
Otis Quick-Scoring Mental Ability Tests: 
Beta Tests, Forms Cm and Dm. These tests 
were designed for grades 4-9, so it was 
thought that very few, if any, of the Peabody 
freshmen would have had any experience with 
the tests for at least four years. 

For each of the forms, Cm and Dm, the
time limit stipulated in the Manual of Directions
is thirty minutes. However, since the
students’ scores were not to be compared with
any norms set up by Otis, time limits of 15
minutes (on the Cm) and 10 minutes (on the
Dm) were set. An informal set of instructions
appropriate to the groups was substituted
for the standardized instructions. These
instructions were, of course, identical for
groups taking the tests in different rooms.
The method of marking the separate answer
sheets was explained; the students were told
to guess if they wished, but emphasis was
placed on the statement that they should not
waste time guessing ⁴; and the groups were
informed that the individual results on the
Otis tests would be kept confidential and
would not go on their records or to their
advisers.

The tests making up the regular freshman
battery were given after the practice tests and
in this order:

1. American Council on Education Psychological
Examination, 1949 College Edition,
Linguistic Tests (ACE-L) and Quantitative
Tests (ACE-Q).

2. Cooperative English Test C2, Form S,
Vocabulary (C2-V); Speed of Reading Comprehension
(C2-S); and Level of Reading
Comprehension (C2-L).

3. Cooperative English Test A: Mechanics
of Expression, Form T (Eng A).

Each individual’s total scores and subscores
on these three tests, expressed graphically
as “normally” spaced percentile ranks based
upon local norms, were furnished to the freshman
advisers as an aid in placement and
counseling.


4 Test scores were corrected for “chance.”


Table 1

Intercorrelations and Beta Weights Involved in the Prediction of First Quarter
Grade Point Ratio from Freshman Tests
(N = 136)

                  0       1       2       3       4       5       6       7
                 GPR    Cm+Dm   ACE-L   ACE-Q   C2-V    C2-S    C2-L    Eng A
1 Cm+Dm          .48            .75     .62     .59     .67     .57     .67
2 ACE-L          H                      Jl      .84     .78     a2      .70
3 ACE-Q          .39                            .40     .56     .53     .53
4 C2-V           .25                                    .69     .65     .60
5 C2-S           .42                                            .90     Eyi
6 C2-L           .38                                                    .51
7 Eng A          .54
Beta Weights            .124    .265    .032   -.427    .104    .075    .413


Results 

Grade point ratios for one quarter and for
three quarters were computed for the students
who were included in the study.⁵ Excluded
from the one-quarter analysis were
freshmen who carried less than eight quarter
hours and from the three-quarter analysis
were freshmen who carried less than 24 quarter
hours. A few students, who arrived late
or who for other reasons were not present for
the first administration of the tests, were
given the tests later under less desirable conditions,
and it was felt that their test scores
were perhaps not comparable with those of
the original group. They too were omitted
from the study.

The correlations between the seven test
scores and between those test scores and GPR
for one quarter and for three quarters are
shown in Tables 1 and 2.


5 GPR was computed, letting A+ = 12 points per
quarter hour, A = 11 points, A- = 10 points, B+
= 9 points, and so on down to D- = 1 point, F = 0
points.


Note that for the first quarter the correlation
between Cm + Dm and GPR exceeds all
other correlations with GPR except that of
Eng A; for three quarters, r between Cm +
Dm and GPR exceeds the correlations of
GPR with ACE-Q, C2-V, C2-S, and C2-L.
Among these validity coefficients, however,
only the differences between Cm + Dm and
C2-V for one quarter and between Cm + Dm
and ACE-Q for three quarters are significant
beyond the one per cent level of confidence,
Table 2

Intercorrelations and Beta Weights Involved in the Prediction of Three-Quarter
Grade Point Ratio from Freshman Tests
(N = 119)
— 7 i 2 3 ` 4 5 6 t 
coe ipe ACE-L ACE-Q C2V C2-S C2 Eng A 
Ss = 75 62 59 68 59 65 
: Caor DM E 49 83 76 72 71 
; OR So 40 57 53 52 
: Acma 2 68 66 61 
4 Š a 9 7 
S C25 AB i a 
6 or Al ü 
7 EngA 55 
Beta Weights   .242   .013   -.115   .064   -.082   .166   .367




when the significance test for the difference
between r’s with a common criterial array is
used.⁶
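
The test for the difference between two correlations that share a criterion can be sketched as follows. The code assumes the form usually credited to Hotelling, a t ratio with N - 3 degrees of freedom; the article cites McNemar for the procedure but does not print the formula, so this choice is an assumption, as are the function and variable names. The example compares Cm + Dm (r = .48 with GPR) and C2-V (r = .25 with GPR) for one quarter, using their intercorrelation of .59 as read from Table 1 with decimal points restored.

```python
import math


def hotelling_t(r_xy: float, r_zy: float, r_xz: float, n: int):
    """t test for the difference between r_xy and r_zy (common criterion y).

    Assumed form: Hotelling's t for correlated correlations, df = N - 3.
    """
    determinant = 1 - r_xy**2 - r_zy**2 - r_xz**2 + 2 * r_xy * r_zy * r_xz
    t_value = (r_xy - r_zy) * math.sqrt((n - 3) * (1 + r_xz) / (2 * determinant))
    return t_value, n - 3


t_value, df = hotelling_t(r_xy=0.48, r_zy=0.25, r_xz=0.59, n=136)
print(round(t_value, 2), df)
```

With 133 degrees of freedom the two-tailed critical value at the one per cent level is roughly 2.6, so a t of about 3.3 is consistent with the statement that this difference is significant beyond that level.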
The regression equations shown below do
not meet the requirements of the most economical
predictive equations.⁷ Rather, the
beta weights were computed as a means of
studying and comparing the contributions of
the various tests to the prediction of GPR.
The computation of the beta weights was carried
out by a modified Doolittle method.⁸
For one quarter $R_{0.1234567} = .62$,⁹ and the
predictive equation, using standard scores, is


$$z_0 = .124z_1 + .265z_2 + .032z_3 - .427z_4 + .104z_5 + .075z_6 + .413z_7.$$


1 refers to Cm + Dm, 2 to ACE-L, 3 to ACE- 
Q, 4 to C2-V, 5 to C2-S, 6 to C2-L, and 7 
to Eng A. 
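
A minimal sketch of how the one-quarter equation would be applied to a single student follows. The beta weights are those printed in Table 1; the dictionary layout, the function name and the example student are illustrative assumptions. All scores, including the predicted GPR, are in standard-score form.

```python
# One-quarter beta weights from Table 1, keyed by test name.
BETAS_ONE_QUARTER = {
    "Cm+Dm": 0.124,
    "ACE-L": 0.265,
    "ACE-Q": 0.032,
    "C2-V": -0.427,
    "C2-S": 0.104,
    "C2-L": 0.075,
    "Eng A": 0.413,
}


def predict_z_gpr(z_scores: dict, betas: dict = BETAS_ONE_QUARTER) -> float:
    """Predicted standard-score GPR; tests missing from z_scores count as 0."""
    return sum(beta * z_scores.get(test, 0.0) for test, beta in betas.items())


# Hypothetical student one standard deviation above the mean on every test.
uniform_student = {test: 1.0 for test in BETAS_ONE_QUARTER}
predicted = predict_z_gpr(uniform_student)
print(round(predicted, 3))
```

For such a student the prediction is simply the sum of the beta weights, about .59 of a standard deviation above the mean GPR. The negative weight for C2-V lowers the prediction for students whose vocabulary score is high relative to their other scores.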

The largest beta weight is that of the Vo- 
cabulary part of C2. C2-V seems to serve 
as a “suppressor” variable. The hypothesis 
could be advanced that the abilities measured 
by this test are antithetical to those evaluated 
by the first-quarter freshman English course 
teachers at Peabody. Certainly no generality 
can be implied from this scant evidence, how- 
ever, especially when one compares this beta 
weight with the three-quarter beta weight of 
.064 (see Table 2) and considers “sampling
fluctuations.” 

The various beta weights indicate the relative
contributions of the independent variables
to the prediction of GPR. Therefore,
the contribution of Cm + Dm is almost one-half
that of ACE-L, is approximately four
times that of ACE-Q, is about 1¼ times that
of C2-S, is approximately 1⅔ times that of
C2-L, and is almost one-third that of Eng A.
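
The ratios quoted in this paragraph can be recomputed directly from the one-quarter beta weights. The sketch below is only an arithmetic check; the variable names are illustrative, and C2-V is left out because its weight is negative and the text does not compare it in this way.

```python
CM_DM_BETA = 0.124

# One-quarter beta weights of the tests compared with Cm + Dm in the text.
OTHER_BETAS = {
    "ACE-L": 0.265,
    "ACE-Q": 0.032,
    "C2-S": 0.104,
    "C2-L": 0.075,
    "Eng A": 0.413,
}

# Ratio of the Cm + Dm weight to each of the other weights.
ratios = {test: CM_DM_BETA / beta for test, beta in OTHER_BETAS.items()}

for test, ratio in ratios.items():
    print(f"{test}: {ratio:.2f}")
```

The ratios come to about .47, 3.9, 1.2, 1.65 and .30, in line with the verbal description: almost one-half, about four times, somewhat more than one, about one and two-thirds, and almost one-third.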

For three quarters, the multiple R between 
GPR and the seven test scores is .59. (This
R corrected for shrinkage becomes .56.) The
predictive equation, using the same notation 


6 McNemar (3, pp. 124-125).

7 See, for example, Thorndike (4, pp. 201 f.).

8 Thorndike (4, pp. 335-339).

9 Fisher (1) and Wherry (5) have reported on the
amount of bias in a multiple R when it is applied to
a new sample and the correlation with a similar criterion
computed. In such a situation we may anticipate
that the new R will be smaller than the original
R. The multiple R of .62, corrected for such shrinkage,
is .59. See Kelley’s (2) Formula 12:36, p. 474.
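
The shrinkage correction can be illustrated with the commonly used adjustment 1 - (1 - R²)(N - 1)/(N - k - 1), where k is the number of predictors. Whether this is exactly Kelley's Formula 12:36 is an assumption, since the article does not print the formula; the function name is likewise illustrative.

```python
import math


def shrunken_r(r: float, n: int, k: int) -> float:
    """Multiple R corrected for shrinkage (assumed form, see lead-in).

    r: obtained multiple R, n: number of cases, k: number of predictors.
    """
    adjusted_r_squared = 1 - (1 - r**2) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adjusted_r_squared, 0.0))


one_quarter = shrunken_r(0.62, n=136, k=7)     # seven tests, one quarter
three_quarter = shrunken_r(0.59, n=119, k=7)   # seven tests, three quarters
combined = shrunken_r(0.59, n=119, k=4)        # four combined scores

print(round(one_quarter, 2), round(three_quarter, 2), round(combined, 2))
```

Under this assumed formula the one-quarter R of .62 shrinks to .59 and the four-score R of .59 shrinks to .57, both matching the reported figures. The seven-test three-quarter value comes out near .55 rather than the reported .56, a gap small enough to arise from rounding of the obtained R.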




Table 3 


Intercorrelations and Beta Weights Involved in the 
Prediction of Three-Quarter Grade Point Ratio 
from Combined Scores on Freshman Tests 


(N = 119) 
IV 
0 I u om | 
Cm+ ACE C2 Eng 
GPR Dm Total Total f 
ICm+Dm 50 80.68 T 
ILACETotal 48 84 
III C2 Total 47 E 
IV Eng A .55 
Beta Weights   .220   -.090   .171   .369


as above, is 


$$z_0 = .242z_1 + .013z_2 - .115z_3 + .064z_4 - .082z_5 + .166z_6 + .367z_7.$$


Here the contribution of Cm + Dm is about
two-thirds that of Eng A and exceeds all other
contributions.

It is important to mention that “criterion
contamination” may be present in the case of
all of the predictors except Cm + Dm. The
scores on Cm and Dm were the only test
scores which were not made available to the
teachers, and therefore they could have had
no possible influence on the grading. Cm +
Dm contributed substantially under these
ideal conditions.

In arriving at a third predictive equation,
the scores on the tests were combined as
shown in Table 3.

The multiple R is .59 (.57, corrected for
shrinkage). The regression equation becomes


$$z_0 = .220z_{I} - .090z_{II} + .171z_{III} + .369z_{IV}.$$


I refers to Cm + Dm, II refers to ACE Total,
III to C2 Total, and IV to Eng A.

The use of Cm + Dm and Eng A in a fourth
predictive equation, using again three-quarter
GPR, indicates that the inclusion of Cm +
Dm does not greatly enhance the original correlation
between Eng A and GPR, though the
multiple R, of course, represents a more
stable measure. The multiple R is .57. The
regression equation becomes


$$z_0 = .212z_{I} + .412z_{IV},$$


where I refers to Cm+ Dm and IV refers to 
Eng A. 
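
The reported multiple R values for the third and fourth equations can be checked from the printed beta weights. The sketch below assumes the standard identity for standardized regression, R² equal to the sum of each beta weight times the corresponding correlation with the criterion; the validity coefficients .50, .48, .47 and .55 are read from Table 3 with decimal points restored, and the function name is illustrative.

```python
import math


def multiple_r(betas, validities) -> float:
    """Multiple R from beta weights and criterion correlations.

    Uses R^2 = sum(beta_j * r_0j), which holds for standard-score regression.
    """
    return math.sqrt(sum(b * r for b, r in zip(betas, validities)))


# Third equation: Cm+Dm, ACE Total, C2 Total, Eng A.
r_four_scores = multiple_r([0.220, -0.090, 0.171, 0.369],
                           [0.50, 0.48, 0.47, 0.55])

# Fourth equation: Cm+Dm and Eng A only.
r_two_scores = multiple_r([0.212, 0.412], [0.50, 0.55])

print(round(r_four_scores, 3), round(r_two_scores, 3))
```

Both results fall within about .01 of the reported values of .59 and .57. The small difference between the two illustrates the remark that adding further scores to Eng A and Cm + Dm contributes little.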

It is interesting to note that in every case,
Eng A is represented by the highest validity
coefficient and the largest beta weight. The
fact that English is a required course for
freshmen at Peabody and mathematics is not
may help explain the larger predictive value
of some of the English tests, as compared with
ACE-Q; but it does not contribute much toward
an explanation of the differences between
Eng A and the other English tests. In
considering other factors extrinsic to the test
itself which might account for some of the
high degree of relationship between Eng A
and GPR, the fact that Eng A was administered
last comes to mind. The time limit (40
minutes) was adequate for far more than half
of the testees to finish Eng A; the same was
not true of any of the other tests. Perhaps
in addition to knowledge of certain English
fundamentals, Eng A tested, for this group,
motivation and perseverance to a greater degree
than any of the other tests. Perhaps,
further, these qualities are closely related to
freshman GPR at Peabody.


Summary 


1. Two easy tests (Otis Quick-Scoring Mental
Ability Tests, Beta Tests, for Grades 4-9,
Forms Cm and Dm) were administered to a
group of entering college freshmen for the
purpose of giving the students practice on objective
tests and acquainting them with the
mechanics of a machine-scored answer sheet.

2. These easy practice tests showed some
predictive values above those of several more
conventional placement tests; namely, American
Council on Education Psychological Examination
(1949 College Edition) and Cooperative
English Test C2 (Form S). The
usefulness of the inclusion of the Otis tests
in such a battery is indicated.

3. The Cooperative English Test A: Me- 
chanics of Expression (Form T), in every 
combination with the other placement tests, 
contributed most substantially to prediction 
of freshman GPR. 

4. The numerically largest beta weight in 
the one-quarter regression equation was that 
of C2-Vocabulary, which in this sample ap- 
peared to act as a “suppressor” variable. The 
corresponding beta weight in the three-quarter 
predictive equation was small but was posi- 
tive instead of negative. 


Received August 26, 1952. 


References 


1. Fisher, R. A. Influence of rainfall on the yield of 
wheat at Rothamstead. Philos. Trans., 1923, 
213, 89-142. 

2. Kelley, T. L. Fundamentals of statistics. Cam- 
bridge: Harvard Univ. Press, 1947. 

3. McNemar, Q. Psychological statistics. New York:
Wiley, 1949.

4. Thorndike, R. L. Personnel selection: test and
measurement techniques. New York: Wiley,
1949.

5. Wherry, R. J. A new formula for predicting the
shrinkage of the coefficient of multiple correlation.
Ann. math. Statist., 1939, 2, 440-457.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


Development of a Short Test to Predict a Complex
Aggregate Score ¹


Helen Tomlinson and John T. Preston 


USAF Training Command, Human Resources Research Center, Personnel Research 
Laboratory, Lackland Air Force Base, San Antonio, Texas 


In a large organization, job assignment in- 
volves administration to potential employees 
of a wide variety of tests to predict proba- 
bility of success in different kinds of assign- 
ments. This is an expensive procedure, and 
most organizations which require continuous 
hiring have worked out means of spotting ap- 
plicants who have little chance to prove suc- 
cessful in any current job openings. One of 
these is the short preliminary test with cut-off 
scores based on empirically determined prob- 
abilities of meeting qualifying requirements 
for job appointment. 

This article describes a technique for con- 
structing a short test to determine probability 
of reaching a qualifying score on an aggregate 
score empirically weighted for prediction of 
success in one group of training schools. The
method was developed in connection with the 
Service problem of recruitment to meet man- 
power needs in a specialized job area. 


Method 


In the Air Force problem, the required short 
test is to predict probability of reaching a 
qualifying cut-off on a weighted aggregate 
score, the aptitude index. Each aptitude 
index combines the variables of the Airman 
Classification Battery for optimum predic- 
tion of training success in one cluster of re- 
lated technical specialties. 

A preliminary form, twice the length speci- 
fied for the final predictor test, was composed 
of blocs of items, with each bloc patterned 
after one of the components of the aptitude 
index. Analysis of results from a try-out of 
the preliminary form determined selection of 
items for the final form. Test scores included
in the aptitude index are converted to normalized
standard AF scores before weighting
them into the aggregate. Sections of the predictor
test are weighted by the number of
items so that the total raw score is appropriately
weighted. Table 1 shows the composition
of the criterion, and the preliminary
and final forms of the predictor.


1 The views expressed in this article are those of
the authors and do not necessarily represent the official
views of the United States Air Force.

Items in five of the six subtests were selected
from appropriate item pools for significant
discrimination against the corresponding
test of the aptitude index. Experience
items were selected and keyed for positive
correlation with both the Biographical Inventory
score and the aptitude index. Because
of the restricted choice of items for
Subtest 6, a group of items in a related area,
Test VII, was included in the preliminary
form for possible use if the items of Subtest 6
did not hold up in the analysis.


The preliminary 64-item form was administered
to two unselected samples which had
about equal representation of men assigned
for basic training to Sheppard and to Lackland
Air Force Bases.

For analysis purposes, each sample of 370
was split into the Sheppard group and the
Lackland group. Table 2 shows the composition
of the samples and score distribution
statistics for the preliminary test, the criterion,
and a vocabulary test.

Table 1

Composition of the Criterion, the Preliminary Form,
and the Final Form of the Predictor

        Criterion                      Predictor
                                          No. of Items
Test    No. of Items   Weight   Subtest   Prelim.   Final
I 35 2 = 1 12 6 
I i8 9 2 12 3 
IO 300 1 3 7 6 
IV 30 2 4 10 3 
V i5 4 5 6 6 
VI 30 2 6 10 > 
VII 20 3 7 Gee 


Table 2

Raw Score Distribution Statistics for the Preliminary Test and Standard AF Score Distribution
Statistics for the Aptitude Index and a Vocabulary Test

                     Preliminary    Aptitude     Vocabulary    Correlation of
                     Test           Index        Test          Preliminary Test
               N     M      SD      M     SD     M     SD      with Aptitude Index
Sample I 
Lackland 194 41.1 11.0 5.5 2A 6.0 1.8 80 
Sheppard 176 38.2 10.7 54 21 55 19 87 
Sample II 
Lackland 193 41.2 7 5.6 2.2 5.6 2.0 86 
Sheppard 177 38.3 5.2 2.1 5.3 20 86 
Table 3 
Distribution Statistics and Correlations of the Predictor* with the Criterion 
. Predictor Aptitude Index Correlation of 
J ————— Predictor with 
N M SD M SD Aptitude Index 
Sample I 7 Pap 
Lackland 194 19.5 5.8 5.5 87 
Sheppard 176 SET 5.9 5.4 85 
Total 370 18.7 5.9 5.5 2.1 .85 
Sample II n 
Lackland 193 19.4 6.1 5.6 2.2 86 
Sheppard 177 17.8 6.0 5.2 2.1 84 
Total 370 18.6 6.1 4 2.1 85 
Sample ITI i 
(Independent) $82 17.0 6.6 49 2.1 86 


* Scoring keys for the selected 30 items applied to 


Selection of items for the final form was
based on several criteria: (1) Correlation
(φ) with the aptitude index; (2) Correlation
(φ) with the parent test; (3) Lower discrimination
for other tests of the aptitude index
composite than for the parent test; and (4)
Difficulty index centering around 60% R (uncorrected).

Effect of chance error was diminished by
running the item statistics separately for the
two samples. Instead of preparing separate
30-item keys for each sample, the criterion of
consistency between samples was added for
producing a single 30-item key.


Efficiency of Prediction 


Application of the 30-item key to the a 
¿ Perimental answer sheets yielded the distri u 
» tion statistics and correlations appearing in 


answer sheets for the preliminary test. 


Table 3. The key was developed on Samples 
Tand II. Sample III is independent. 
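
As an illustration of the item-selection criteria (1) and (2) and of the consistency requirement described above, the following minimal sketch computes a phi coefficient from two dichotomous variables and keeps only items that reach a minimum phi in both samples. The function names, the dichotomization of the criterion, and the threshold of .30 are assumptions made for the example; the article does not report the cut-off values that were actually used.

```python
from math import sqrt


def phi_coefficient(item, criterion):
    """Phi coefficient between two equal-length sequences of 0/1 values."""
    if len(item) != len(criterion):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(item, criterion))
    # Cells of the 2 x 2 table: a = both 1, b = item only, c = criterion only, d = both 0.
    a = sum(1 for x, y in pairs if x == 1 and y == 1)
    b = sum(1 for x, y in pairs if x == 1 and y == 0)
    c = sum(1 for x, y in pairs if x == 0 and y == 1)
    d = sum(1 for x, y in pairs if x == 0 and y == 0)
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # a margin is empty, so phi is undefined; treated as zero here
    return (a * d - b * c) / denominator


def consistent_items(phis_sample_1, phis_sample_2, minimum=0.30):
    """Indices of items whose phi reaches `minimum` in both samples.

    The threshold is hypothetical, not a value taken from the article.
    """
    return [
        i
        for i, (p1, p2) in enumerate(zip(phis_sample_1, phis_sample_2))
        if p1 >= minimum and p2 >= minimum
    ]
```

Requiring agreement across two samples is one simple way to reduce the number of items that look discriminating only by chance in a single sample.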

Table 4

Correlations with Control Aptitude Index
(Sample: 324 Lackland Airmen)

                                M      SD     Correlation with Control Aptitude Index
  Control Aptitude Index        5      2.1
  Criterion Aptitude Index      5      2.1       .77
  Predictor Score              19.9    5.7       .65


Fig. 1. Curve showing probability of making an aptitude index of 5 or more for each raw score on the predictor test. (Horizontal axis: predictor test raw score. Vertical axis: probability of making an aptitude index of 5 or more. Inset, "The Probability Table": predictor score, from 27 and over to 8 and below, against percent making the index; total 1622.)


Since the new test is essentially a short form of the aptitude index, the predictor has a special relation to the criterion. The Kuder-Richardson estimate of reliability (Formula 20) is .84. The estimate obtained by correlating scores for the final form against scores for the remaining 34 items is .82. These reliability coefficients do not differ significantly from the coefficients of prediction (.84 to .87).
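
For readers who wish to reproduce a Kuder-Richardson Formula 20 estimate of the kind cited above, a minimal sketch follows. It assumes item responses scored 0 or 1 and uses the population variance of total scores; the function name and the input layout (one list of item scores per examinee) are assumptions for illustration, not details from the article.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of examinees' 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n  # population variance
    if variance == 0:
        raise ValueError("total scores have zero variance")
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / variance)
```

With four examinees answering three items as `[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]`, the total-score variance is 1.25, the sum of the item variances is 0.625, and the estimate is 0.75.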

To show that the predictor test is specifi- 
cally predictive of the criterion, the correla- 
tion was computed with a control aptitude 
index of median correlation with the criterion 
aptitude index and maximum correlation with 
the Armed Services Qualification Test. Table 
4 shows these correlations. 

Because of common components in the eight 
Air Force aptitude indexes, their intercorrela- 
tions are high. Consequently the correlation 
of the predictor with the control aptitude in- 
dex (.65) is less than the correlation between 
the criterion aptitude index and the control 
aptitude index (.77). 

Rather than a direct score conversion table, empirical probabilities for achieving an aptitude index of at least 5 were computed from the combined sample of 1622 for each predictor score directly from the 30 x 9 regression chart. Figure 1 shows that the curve of percentages closely approximates the expected ogive. The steepness of the curve between predictor scores 16 and 17 suggests a reasonable cut-off for a better-than-even chance of qualifying for training in the specific job area.
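
The empirical probability table described above can be sketched as follows: for each predictor score, the proportion of cases reaching an aptitude index of 5 or more is tabulated, and the lowest score with a better-than-even chance is read off. The function names and the sample pairs in the note below are hypothetical; the article's own data are not reproduced here.

```python
from collections import defaultdict


def empirical_probabilities(pairs, cutoff=5):
    """Map each predictor score to the proportion of cases with index >= cutoff.

    `pairs` is an iterable of (predictor_score, aptitude_index) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # score -> [qualifying, total]
    for score, index in pairs:
        counts[score][1] += 1
        if index >= cutoff:
            counts[score][0] += 1
    return {score: q / n for score, (q, n) in sorted(counts.items())}


def lowest_better_than_even(table):
    """Lowest predictor score whose empirical probability exceeds .50, or None."""
    for score in sorted(table):
        if table[score] > 0.5:
            return score
    return None
```

For the hypothetical pairs `[(15, 4), (15, 5), (15, 3), (16, 5), (16, 4), (17, 6), (17, 5), (17, 4)]`, the proportions are 1/3 at score 15, 1/2 at score 16, and 2/3 at score 17, so the better-than-even cut-off falls at 17.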


Summary 


This method of constructing short screening tests offers two advantages:

1. The screening process can be directed to current personnel requirements.

2. A simple “Rights” score read into a table of probabilities makes it possible to use clerical assistants with only a minimum of training for administering, scoring, and interpreting the tests.

The method is applicable only in conjunction with reliable, well-validated selection instruments.


Received November 10, 1952. 




THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


The Classification of Occupations by Means of Kuder Interest 
Profiles: I. The Development of Interest Groups 


John L. Holland, Allen H. Krause, M. Eloise Nixon, and Mary F. Trembath 


Vocational Counseling Center, Western Reserve University


The need for a more extensive knowledge of occupational interests is well recognized by vocational counselors. The empirical evidence is limited largely to the data concerning the Strong Vocational Interest Blank and the Kuder Preference Record. Since these inventories have been applied to only a small number of occupations, it is frequently necessary in practice to speculate about the nature of the interests for many occupations. An “unknown” occupation presents at least two problems: (1) What are the characteristic interests of this occupation? and (2) In what known interest group does this occupation belong?

These problems are commonly approached by developing interest keys using Strong’s method or by securing Strong interest profiles for unknown occupations. Since these methods present practical and financial problems, a simpler, less expensive method of measuring and classifying interests for a large number of occupations would accelerate the extension of our knowledge of interests and interest groups.

This study is an attempt to classify occupations empirically using KPR profiles and to present a research tool for validation. While the sets of interest groups developed here are empirical in nature, their predictive power, especially with respect to job satisfaction, is largely unknown. It is hoped that others will be able to test and extend this classification system by making cross-comparisons with the Strong, and by predicting job satisfaction or vocational choice.


Method


A group of 45 KPR profiles representing the occupational groups for men, as defined by the SVIB, was selected from the Kuder manual of instructions (4). This selection was made since the Strong groups represent a classification system supported by considerable evidence including a number of factor analyses by Strong (6) and Thurstone (8). It is assumed that a new method of classification would reveal comparable interest groups.

About 27 of the 45 occupations used in this 
sample are similar to, if not identical with, 
the Strong sample of 44 occupations. Eighteen 
additional Kuder profiles were used to define 
more accurately the limits of a given interest 
group, and to increase the representativeness 
of the group by employing a larger sample of 
occupations. This need is especially apparent 
in the case of groups defined by one to three 
occupations. 

A second group of 42 profiles was secured 
for women. No attempt was made to select 
these profiles in terms of occupational groups; 
however, apprentice and military occupations, 
except for aviation assembly and repair 
worker, were avoided. Also, housewives as 
an occupational group were not used since the 
Kuder sample is limited to only three groups 
of housewives (wives of lawyers, physicians, 
and farmers) and may accordingly be quite 
unrepresentative of housewives-in-general. 

For females, the Kuder sample contains 
about 17 of the 25 occupations in the present 
Strong Blank. The twenty-five additional oc- 
cupations again were used to define the limits 
and to increase the representativeness of the 
interest groups. 

Using rho as an index of profile similarity, the Kuder profiles were intercorrelated.¹ By grouping profiles which intercorrelate .70 or greater, a set of interest groups was derived for each sex. This procedure consisted of a simple cluster analysis. Each profile was inspected for its pattern of intercorrelations. Occupational groups were formed by classifying profiles having similar intercorrelational patterns. In most cases the evidence for a particular grouping is clear-cut; however, when a profile correlates with about equal frequency with the occupations of two or more groups, it is listed in each group in order to give an accurate picture of the data.


¹ Profiles were coded by listing the highest scale first with the remaining scales following in descending order. For these computations the outdoor scale was omitted since it was available for only a few occupations. The matrices formed by the intercorrelation of profiles (16 pages) are deposited with the American Documentation Institute. Order Document 3725 from American Documentation Institute, Library of Congress, Washington 25, D. C., remitting $1.00 for microfilm (images 1 inch high on standard 35 mm. motion picture film) or $2.40 for photocopies (6 x 8 inches) readable without optical aid.
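
The intercorrelation step can be illustrated with a minimal sketch, assuming each profile is a list of nine scale scores with no ties and that rho is the usual Spearman rank-difference coefficient. The function names and the idea of returning the qualifying pairs as a list are assumptions for the example; the authors formed the final groups by inspecting the patterns of intercorrelation, which this sketch does not automate.

```python
from itertools import combinations


def ranks(scores):
    """Rank positions for a profile, 1 = highest score (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rho(profile_a, profile_b):
    """Spearman rank-difference correlation between two untied profiles."""
    n = len(profile_a)
    ranks_a, ranks_b = ranks(profile_a), ranks(profile_b)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def similar_pairs(profiles, threshold=0.70):
    """Pairs of profile names whose rho is at or above the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if rho(profiles[a], profiles[b]) >= threshold
    ]
```

The pairs returned by `similar_pairs` are only the raw material for grouping; deciding which profiles share a similar pattern across the whole matrix remains a matter of judgment, as the text notes.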

It seemed meaningful to make several exceptions to this method of classification. In the case of males, the profiles for farmer, aviator, carpenter, and forest supervisor are placed in the interest group with which they show the highest single correlation. For females, this relaxation of the criterion was made for stenographer and typist, and sales clerk. Although these profiles fail to meet the criterion for inclusion, their intercorrelational patterns are similar to the typical pattern of their respective groups.

In order to secure a useful arrangement of occupations within a given interest group, occupations are ordered in terms of their “representativeness.” The occupation which shows the greatest number of correlations equal to or greater than .70 with the other members of the same group is listed first in that group. This occupation is designated as the “core” occupation. The remaining occupations are then arranged in descending order of correlation with the core profile. Ties in rank were resolved by comparing the number of correlations equal to or greater than .70 for each of these occupations with the other occupations in the same group. The relative position of an occupation within its group serves then as a crude index of its communality or belongingness.
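
The ordering rule for occupations within a group can be sketched as follows, assuming the within-group correlations are available as a symmetric nested dictionary. The function name is hypothetical, and the choice of the alphabetically first occupation when two candidates tie for the core position is an assumption, since the article does not say how such a tie would be settled.

```python
def order_group(corr, threshold=0.70):
    """Return group members with the core occupation first.

    `corr[a][b]` is the rho between occupations a and b within one group.
    """
    members = sorted(corr)

    def links(occupation):
        # Number of other members correlating at or above the threshold.
        return sum(
            1
            for other in members
            if other != occupation and corr[occupation][other] >= threshold
        )

    # max() keeps the first of several tied candidates (alphabetical here).
    core = max(members, key=links)
    rest = [m for m in members if m != core]
    # Descending correlation with the core; ties broken by the link count.
    rest.sort(key=lambda m: (corr[core][m], links(m)), reverse=True)
    return [core] + rest
```

An occupation's position in the returned list then plays the role the text assigns to it: a rough index of how central the occupation is to its group.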


Table 1 
Kuder Interest Groups (Male) 
Group Occupation Rho Code* N 
T. Skilled and Technical = aa 
Asst. District Rangers 28 
"e 0132568794 a 
ee ae 98 1305286974 37 
Vocational Training Teachers .87 1153826974 35 
Machinists E 80 1153270864? 117 
Meteorologists (2) 80 3161275894 185 
— 78 3216547098” 653 
Ree -17 513279684 216 
P ea .52 317965284 34 
Carpenters 39 0913825647’ 129 
II. Managerial 32 "1058973246" et 
Production Managers m 139 
Engineers (3 2631495807 
mei a "3216547098" pe 
III. Scientific aa 239164758 si 
Psychologists 
H. S. Teachers of Mathematics = 362857914 w 
Laboratory Technicians aT 23’867519"4 9 
All Physicians and Surgeons a 356278914" e 
Chemists 1 3658071294 
District Rangers 76 3167251849 54 
Meteorologists (2) 73 0256381749" 102 
Protective Service Occupations 361275894 e 
Engineers (3) à 6870351249" 
IV. Drugstore Managers and Pharmacists 8 3216547098 = 
— "342987561 140 


* Scales to the left of the first apostrophe indicate high scales at or above the 75th percentile. Scales to the right of the last apostrophe are at the 25th percentile or below.

** The number in parentheses following occupational title designates occupations listed in two or three interest groups.

Table 1 (Continued)

Group / Occupation                               Rho     Code*           N
V. Welfare
  H. S. Teachers of Social Studies                       8'6243795'1     23
  School Administrators                          .88     862934751       65
  Social and Welfare Workers                     .83     864392571       53
  Forest Supervisors                             .68     0682543197'      ?
  Clergymen                                      .63     867932541       43
VI. Clerical
  General Office Clerks                                  9267438501'    110
  General Accountants                            .98     29'64730851'    92
  Office Mgrs. and Chief Clerks                  .91     2964758301     138
  Cost Accountants                               .91     296475318       28
  Printers and Pressmen                          .83     976253418'      32
  Financial Institution Clerks                   .78     298674351       25
  Lawyers and Judges (2)                          ?      6'97428503'1   331
  Purchasing Agents and Buyers (2)                ?      '469278531'    103
VII. Expressive
  Sales Managers                                         4768953201'    230
  Life Insurance Salesmen                        .98     47869532'1      50
  Salesmen, to Consumers                         .96     4678935012'    353
  Reg. Salespersons, Dept. Store                 .90     4'7965823'1    184
  Stock and Bond Salesmen                        .90     487965231      118
  Advertising Agents                             .88     647589312       26
  Route Salesmen                                 .87     487569312'     104
  Authors, Editors, and Reporters                .85     6794850321     113
  Personnel Managers                             .83     4867923051      50
  Banking, Finance, and Insurance Officials      .82     24796258031     42
  Lawyers and Judges (2)                         .75     6974285031     331
  Purchasing Agents and Buyers (2)               .70     7469278531'    103
  Musicians and Music Teachers                   .70     765894321       77
  Commercial Artists                             .50     5'6794281'3     31


Each set of interest groups was arranged by inspecting the matrix formed by the core occupations. The highest negative correlation for a pair of core occupations designates the first and last interest groups, or the groups whose interests are most divergent. The remaining groups are placed between these groups in accordance with their correlation with the first group listed.²


² If the remaining groups are arranged in accordance with their correlation with the last group, Groups V and VI will change position, but the original arrangement is essentially the same.


Results


Males. The classification of profiles produced seven interest groups which have been designated tentatively as: I. Skilled and Technical; II. Managerial; III. Scientific; IV. Drugstore Managers and Pharmacists; V. Welfare; VI. Clerical; and VII. Expressive. Table 1 shows the male sample classified into these groups. To increase the usefulness of Table 1, the rho with the core occupation, the N, and the code³ for each occupation have been listed.
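
The rule for arranging the interest groups themselves (the most negatively correlated pair of core occupations marks the two ends, and the remaining groups are ordered by their correlation with the first) can be sketched as follows. The function name and the nested-dictionary input are assumptions, as is the choice of which member of the extreme pair is placed first, since the article does not state it.

```python
from itertools import combinations


def arrange_groups(core_corr):
    """Order groups from the matrix of correlations among their core occupations.

    `core_corr[a][b]` is the rho between the core occupations of groups a and b.
    """
    names = sorted(core_corr)
    # The pair with the lowest (most negative) correlation defines the two ends.
    first, last = min(
        combinations(names, 2), key=lambda pair: core_corr[pair[0]][pair[1]]
    )
    middle = [g for g in names if g not in (first, last)]
    # Remaining groups in descending order of correlation with the first group.
    middle.sort(key=lambda g: core_corr[first][g], reverse=True)
    return [first] + middle + [last]
```

Ordering the middle groups by their correlation with the last group instead would give the alternative arrangement mentioned in the authors' footnote.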

In general, the obtained groups are similar to related Strong Groups; the Kuder Groups I, II, III, V, and VI agree in content with the Strong Groups IV, III, I, V, and VIII, respectively. The differences between comparable groups seem slight.


³ The attention of readers is directed to the misleading character of a mechanical coding device for counseling purposes that isolates percentiles of 75 or greater or percentiles of 25 or lower. See Diamond, S. The interpretation of interest profiles. J. appl. Psychol., 1948, 32, 512-520. (Editor.)




Table 2

Kuder Interest Groups (Female)

Group / Occupation                               Rho     Code*           N
I. Computational
  H. S. Teachers of Mathematics                          23159786?        ?
  Office Machine Operators                       .93     213598746'       ?
  Tearoom and Restaurant Mgrs.                    ?      235146978        ?
  Hospital Dietitians** (2)                       ?      321765849'      31
  Statistical Clerks                             .43     236574189'       ?
II. Scientific and Technical
  H. S. Teachers of Home Economics                       7513867249'    136
  Aviation Assembly & Repair Wrkrs.              .90     1735786249       ?
  Cooks and Bakers                               .88     '16347582'9      ?
  Physicians                                     .87     318562749       43
  All Trained Nurses                             .83     835176249'    1071
  Supervisors and Head Nurses                    .70     '831752649'    196
  Dental Hygienists                              .70     8357146279      35
  Laboratory Technicians                         .70     31'527486'9     31
  Occupational Therapists                        .63     157386249       70
  Home Demonstration Agents                      .62     2851473269      24
  Hospital Dietitians (2)                        .43     321765849       31
III. Clerical
  H. S. Teachers of Commercial Subjects                  294756138'      64
  General Office Clerks                          .85     1295471368'    136
  Bookkeepers                                    .80     296754381       62
  Stenographers and Typists                      .32     1973562148'    235
IV. Linguistic
  Journalists                                            6'5473182'0     31
  Copy Writers, Mail Order Co.                   .95     645713289       19
  Librarians                                     .78     6737149328'     39
  Artists and Art Teachers                       .75     5'167483'29     22
  H. S. Teachers of Language (2)                 .75     675489132       42
  H. S. Teachers of English (2)                   ?      675482193'     110
V. Expressive
  Assistant Buyers, Dept. Store                          ?               58
  Personnel Mgrs., Mail Order Co.                .87     ?               34
  H. S. Teachers of English (2)                  .86     675482193      110
  H. S. Teachers of Social Studies               .80     684527193'      56
  Social Workers                                  ?      864753129       50
  H. S. Teachers of Language (2)                 .75     675489132       42
  Personnel Wrkrs. other than Mgrs.              .73     486217953       27
  Musicians and Music Teachers                   .70     75684192'3      68
  Cashiers                                       .70     ?               43
  Retail Buyers                                  .70     ?               29
  Salespersons, Dept. Store, Reg.                .70     ?              617
  Religious Workers                               ?      47389621        31
  Floor and Section Mgrs., Dept. St.             .65     ?               25
  Secretaries                                     ?      ?              121
  Executives, Mail Order Company                  ?      4019521587      61
  Teachers, Primary and Kindergarten             .50     4'27653981     544
  Office Mgrs., Chief Clerks                     .40     ?               29
  Sales Clerks                                    ?      ?               26
  Telephone Operators                             ?      849761325'      22

* Scales to the left of the first apostrophe indicate high scales at or above the 75th percentile. Scales to the right of the last apostrophe are at the 25th percentile or below.

** The number in parentheses following occupational title designates occupations listed in two or three interest groups.




In the Kuder classification, engineers are classified in the skilled, scientific, and managerial groups rather than in science alone as in the SVIB. District rangers are classified in both the skilled and scientific groups rather than in science alone. Protective service occupations are classified in science rather than in skilled trades. Personnel managers are classified in the expressive group rather than social service. Lawyers are classified in the clerical and expressive groups rather than language occupations only. Printers and pressmen are classified in the clerical group rather than in the skilled trades. Purchasing agents and buyers are classified in both the clerical and expressive groups rather than in business detail alone. Pharmacists and drugstore managers form a separate group in the Kuder data while they fall in Group VIII on the Strong.

These differences are due in part to the classifying of occupations in one or more groups by the KPR, and to the limiting of an occupation to a single group by the SVIB. Consequently, only four of the above occupations are placed in opposed interest groups: protective service occupations, personnel managers, printers and pressmen, and pharmacists and drugstore managers.

The creation of Group VII, Expressive Occupations, presents the greatest difference between these interest systems. Group VII incorporates Strong Groups IX and X and a number of other occupations: salesmen to consumers; regular salespersons (department store); stock and bond salesmen; route salesmen; personnel managers; banking, finance, insurance officials; purchasing agents and buyers; musicians and music teachers; and commercial artists. The cohesiveness of Group VII is marked. The first four occupations intercorrelate .70 or greater with each other. Furthermore, the median rho for the entire group of 14 occupations with the core occupation, sales manager, is .86.

While it is possible to fractionate Group VII into three sub-groups consisting of sales, language, and art (including music), there is no empirical justification for such groups since the correlational patterns show no sharp differences. This procedure would, of course, make the two interest systems appear more alike.

Females. The correlational matrix for females produced five interest groups: I. Computational; II. Scientific and Technical; III. Clerical; IV. Linguistic; and V. Expressive.
The comparability of these groups with the 
13 groups shown by Strong (7) is difficult to 
ascertain. The Strong groups are defined fre- 
quently by a single occupation so that their 
limits are not clearly structured. 

Kuder Group I, Computational, includes 
Strong Group XII (mathematics and science 
teacher) and three additional occupations not 
found in the Strong sample: office machine 
operators, tearoom and restaurant managers, 
and statistical clerks. Hospital dietitian in 
Strong Group IX falls in this group and also 
in Group II, Scientific and Technical. 

Kuder Group II, Scientific and Technical, 
includes Strong Groups IX, XI, and XIII 
(home economics teacher, dietitian, occupa- 
tional therapist, nurse, laboratory technician, 
and physician) as well as a number of occu- 
pations not contained in the SVIB. 

Kuder Group III, Clerical, is a rough approximation of Strong Group VII as both
groups include high school teachers of com- 
mercial subjects, general office workers, book- 
keepers, and stenographers. 

Kuder Group IV, Linguistic, combines 
Strong Groups I and II (artist, librarian, and 
English teacher) and a number of language 
occupations not included in the SVIB. 

Kuder Group V, Expressive, combines Strong Groups II, IV, VI and VII (social worker, social science teacher, elementary
school teacher, and buyer) as well as a num- 
ber of related sales, social service, and ad- 
ministrative occupations. High school teach- 
ers of English which occur in Kuder Group 
IV are also listed in Group V. 

In general, the Kuder set of interest groups 
appears to be a coarser classification system 
than the Strong set, but the direction of their 
classification appears essentially the same. A 
more accurate picture of the relationships be- 
tween these systems can be obtained by a 
comparison of their correlational matrices. 




Discussion 


The occupational classification system pre- 
sented here is limited by the original data. 
The representativeness of most of the occu- 
pational profiles is largely unknown; fur- 
thermore, each criterion group contains both 
successful and unsuccessful, experienced and 
inexperienced, as well as satisfied and dis- 
satisfied workers (4). The usefulness of this 
interest system in interpreting individual pro- 
files is consequently restricted. 

The validity of these interest groups is 
problematical. Tests of significance are in- 
appropriate because of the large number of 
computations and because of the unorthodox 
use of rho; that is, varying N’s and yet only 
eight degrees of freedom for each correlation. 
The use of rho as an index of profile simi- 
larity violates the assumptions underlying its 
application. Moreover, as Guilford (3) points 
out, the “ipsative property of the Kuder 
scores . . . renders their use for intercorrela- 
tions among themselves . . . so questionable 
as to preclude attempts at analysis by the R- 
technique.” 

These difficulties pose a choice between abandoning the evidence until more adequate data can be obtained and statistical elegance can be achieved, and testing the interest groups by means of a prediction study. The latter alternative seems desirable in view of the need for an immediate understanding of the present data. Although a predictive test may make statisticians wince, it will yield an estimate of the predictive power of the classification.

While they furnish only suggestive and in- 
complete evidence, several related studies sup- 
port the Kuder interest groups. In a factor 
analysis of the SVIB, KPR, Bell Inventory, 
and MMPI, Cottle (2) extracted seven fac- 
tors similar to those described by Thurstone 
(8) and Strong (6). Cottle’s interest fac- 
tors support the Kuder classification includ- 
ing Group VII, Expressive, which combines 
Strong Groups IX and X in addition to a 
number of other occupations. The factor G 
obtained by Cottle appears similar to the 
Kuder Group VII since it is characterized by 
high positive loadings on Strong Groups IX 
and X and high negative loadings on the 
Kuder mechanical scale. Similarly, an inspection of the present Kuder matrix indicates that the expressive occupations correlate most negatively with the skilled trade occupations.

In addition, Cottle’s matrix for the SVIB 
with the KPR reveals that the Strong Group 
keys correlate with the KPR scales in a pat- 
tern which is similar to the coded profile for 
the comparable Kuder interest group. The 
positive and negative correlations of Kuder 
scales with a given Strong group scale produce 
coded profiles which are similar to those in 
the Kuder Groups. For example, the code 
for high school teachers of social studies, the 
core occupation for Kuder Group V, Welfare, 
is 862437951. Cottle’s matrix produces the 
code 867495213. The other five group scales 
show similar relationships with the Kuder in- 
terest groups. 
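
As a rough check on how similar the two codes quoted above are, the sketch below converts each code to rank positions and computes the rank-difference rho. It assumes that a code is simply the nine scale numbers written from highest to lowest and that apostrophes can be ignored; the function names are hypothetical, and the article does not itself report a coefficient for this pair of codes.

```python
def code_to_ranks(code):
    """Map each scale digit in a profile code to its position (1 = highest)."""
    digits = [ch for ch in code if ch.isdigit()]  # apostrophes are dropped
    if len(digits) != len(set(digits)):
        raise ValueError("a scale appears more than once in the code")
    return {scale: position for position, scale in enumerate(digits, start=1)}


def rho_from_codes(code_a, code_b):
    """Spearman rank-difference correlation between two profile codes."""
    ranks_a, ranks_b = code_to_ranks(code_a), code_to_ranks(code_b)
    if set(ranks_a) != set(ranks_b):
        raise ValueError("codes must contain the same scales")
    n = len(ranks_a)
    d_squared = sum((ranks_a[s] - ranks_b[s]) ** 2 for s in ranks_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Codes quoted in the text for Kuder Group V and for Cottle's matrix.
print(round(rho_from_codes("862437951", "867495213"), 2))  # prints 0.58
```

For these two codes the squared rank differences sum to 50, so rho = 1 - 300/720, or about .58; the three highest-ranked positions that the codes share (scales 8, 6, and 4) account for most of the agreement.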

In a study of occupational test patterns, Barnette (1) has supplied Kuder profiles for “success” and “failure” samples for five occupational groups: engineers, accountants, clerical personnel (except accountants), a specialized clerical group (primarily verbal in nature), and salesmen. When the pairs of “success” and “failure” profiles are correlated with comparable Kuder core occupations, six of the seven tests reveal that the “success” groups correlate higher with the core occupation than do the “failure” groups. The data suggest that the degree of similarity of an individual profile to a given interest group is related to success or failure in the occupation. Table 3 shows these relationships.


Table 3

The Relation of Occupational Success and Failure to Core Occupations of Corresponding Kuder Interest Groups*

Occupations                 N       I      II     III     VI     VII
Engineers            S     83     .83    .66     .44
                     F     39     .59    .21    -.09
Accountants          S     74                            .22
                     F     24                            .85
Clerical             S     20                             ?
  (Specialized)      F     44                             ?
Clerical             S     54                             ?
  (General)          F     40                             ?
Salesmen             S     77                                    .86
                     F     56                                     ?

* Correlations are rho coefficients.


The classification of occupations by means of the entire profile appears to have several advantages which other systems such as the method proposed by Wiener (12) lack. Systems of classification which rely on the two or three highest scales, or the scales above a certain cutting score, utilize only a portion of the data. There is no experimental evidence revealing that “high” scores are more significant or useful than “low” scores. In practice, a coding system which uses both low and high scores impels the counselor to notice and use more of the data. Finally, a total code minimizes the misclassification of occupations. If only the high scores are used to categorize an occupation, occupations which have similar patterns may be sorted into different groups by means of one or two scales which may represent the only differences between a pair of profiles.

The results suggest further that previous studies of the SVIB and KPR (5, 9, 10, 11, 13) may be in error in concluding that these instruments show little relationship. Their findings may be largely negative due to an inappropriate level of analysis; that is, a single scale comparison rather than a pattern method.


Summary 


A sample of KPR profiles was intercorrelated and classified by means of a simple cluster analysis. Sets of occupational groups for men and women were derived which are comparable with many of the SVIB groups. While these groups are empirical in nature, their validity is unknown. It was suggested that validation might be established by prediction studies of job satisfaction and vocational choice.


Received August 29, 1952. 


References 


1. Barnette, W. L. Occupational aptitude patterns of selected groups of veterans. Psychol. Monogr., 1951, 65, No. 5 (Whole No. 322).

2. Cottle, W. C. A factorial study of the Multiphasic, Strong, Kuder, and Bell inventories using a population of adult males. Psychometrika, 1950, 15, 25-47.

3. Guilford, J. P. When not to factor analyze. Psychol. Bull., 1952, 49, 26-37.

4. Kuder, G. F. Examiner manual for the Kuder Preference Record, vocational, form C, second revision. Chicago: Science Research Associates, March 1951.

5. Peters, E. F. Vocational interests as measured by the Strong and Kuder inventories. Sch. and Soc., 1942, 55, 453-455.

6. Strong, E. K., Jr. Vocational interests of men and women. Stanford: Stanford Univ. Press, 1943.

7. Strong, E. K., Jr. Manual for vocational interest blank for women. Stanford: Stanford Univ. Press, 1951.

8. Thurstone, L. L. A multiple factor study of vocational interests. Personnel J., 1931, 10, 198-205.

9. Triggs, F. O. A study of the relation of Kuder Preference Record scores to various other measures. Educ. Psychol. Measmt., 1943, 3, 341-354.

10. Triggs, F. O. A further comparison of interest measurement by the Kuder Preference Record and the Strong Vocational Interest Blank for men. J. educ. Res., 1944, 37, 538-544.

11. Triggs, F. O. A further comparison of interest measurement by the Kuder Preference Record and the Strong Vocational Interest Blank for women. J. educ. Res., 1944, 38, 193-200.

12. Wiener, D. N. Empirical occupational groupings of Kuder Preference Record profiles. Educ. Psychol. Measmt., 1951, 11, 273-279.

13. Wittenborn, J. R., Triggs, F. O. and Feder, D. D. A comparison of interest measurement by the Kuder Preference Record and the Strong Vocational Interest Blanks for men and women. Educ. Psychol. Measmt., 1943, 3, 239-257.

THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953


The Validity of the Mooney Problem Check List * 


Charles J. McIntyre 
The Pennsylvania State College 


The Problem Check List is an instrument 
developed by Mooney (1) to enable the 
teacher or counselor to quickly identify prob- 
lems or problem areas which concern his stu- 
dents. The high school form, which was used 
in this study, consists of 330 problems found 
to be of particular concern to students. They 
are classified into the following eleven major 
areas: (1) Health and Physical Development; 
(2) Finances, Living Conditions, and Employ- 
ment; (3) Social and Recreational Activities; 
(4) Courtship, Sex, and Marriage; (5) So- 
cial-Psychological Relations; (6) Personal- 
Psychological Relations; (7) Morals and Re- 
ligion; (8) Home and Family; (9) The 
Future: Vocational and Educational; (10) 
Adjustment to School Work; and (11) Cur- 
riculum and Teaching Procedures. 

Normally the subject is instructed to un- 
derline those problems that bother him and 
to circle those underlined problems which 
trouble him the most. In this study no dis- 
tinction was made between underlined and 
circled items. 

Mooney (2) has said that the nature of the 
Check List makes it impossible to arrive at a 
definitive conclusion about its validity. Va- 
lidity, he says, must be determined in terms 
of the particular purpose and the particular 
situation. While it probably is true that con- 
ventional measures of validity are difficult if 
not impossible to obtain for an instrument of 
this kind, it appears nevertheless that the 
Check List should meet at least three mini- 
mum requirements: (1) Students recognize 
their own problems; (2) They find these 
problems listed on the Check List; and (3) 
They are willing to record them. 

This study assumes that if these three conditions are met it should be possible to predict the relative number of problems listed by particular groups of students in particular areas. Hence the following hypotheses were formulated: (1) The less intelligent students would have more problems than the more intelligent in the area of Adjustment to School Work; (2) Seniors would have more problems than those in the lower grades in the area of The Future: Vocational and Educational; (3) Students from broken homes would have more problems than those from intact homes in the area of Home and Family; (4) Boys would have more problems than girls in the area of Adjustment to School Work; (5) Boys would have more problems than girls in the area of The Future: Vocational and Educational; (6) Negroes would have more problems than whites in the area of Finances, Living Conditions, and Employment; and (7) Girls would have more problems than boys in the area of Courtship, Sex, and Marriage.

The rationale behind each of the hypotheses should be evident.


* This paper represents the substance of a thesis submitted in partial fulfillment of the requirements for the M.S. degree at The Pennsylvania State College.


Procedure 


Subjects. The subjects were 407 high
school students in grades ten to twelve
inclusive. The school which they attended
was the only public high school in a highly
industrial Pennsylvania city with a population
of approximately sixty thousand. The city
population is highly heterogeneous in terms
of race, religion and national origins, and
this heterogeneity is reflected in the school
population.

Method. Approximately one-fourth of the
school population was sampled. The Check
Lists were administered by the homeroom
teachers during the period prior to the first
class in the morning. Homerooms were
selected so that the proportion of students in
the several courses and classes in the sample
would approximate the proportion of students
in these courses and classes in the entire
school population. In this way it was possible
to secure a reasonably representative sample


Table 1
Groups and Problem Areas Relevant to Hypotheses 1 to 6 with N, Mean, SD and CR for Each

Hypothesis  Problem Area                   Group              N    Mean   SD    CR

1.          Adjustment to School Work      Less intelligent   55   5.1    28    3.40
                                           More intelligent   61   2.9    12

2.          The Future:                    Seniors            5    3.5    .07   3.39
            Vocational and Educational     Sophomores         157  2.4    .04

3.          Home and Family                Broken Home        85   2.8    10    2.49
                                           Intact Home        318  1.9    .02

4.          Adjustment to School Work      Boys               202  4.6    .07   2.28
                                           Girls              204  3.8    .05

5.          The Future:                    Boys               202  3.3    .05   2.14
            Vocational and Educational     Girls              204  2.6    .04

6.          Finances, Living Conditions    Negro              100  3.5    .08   [illegible]
            and Employment                 White              295  2.8    .03


while retaining the administrative convenience 
of intact homerooms. 

In questionnaires of this kind the problem 
of a student’s honesty is a serious one, par- 
ticularly when there is a chance, as here, that 
his teachers may check his responses. As an 
example Olson (3), using the Woodworth- 
Mathews Personal Data Sheet, found that 
more symptoms were reported when the ques-
tionnaire was left unsigned. Therefore in this
study a supplementary instruction sheet was
attached to the Check List explaining that
the study was being conducted to gather in-
formation on the problems of high school stu-
dents and instructing them not to sign the
Check List. They were, however, to put their
names on the instruction sheet and hand this
in for an attendance record. A system of
discrete pinholes pricked through both the in-
struction sheet and the Check List made it
possible to later match the two and identify
the Check List.

Information on the relevant variables was
abstracted from the students’ records in the
school file. Only two of these variables
require further definition: (1) “Less intelligent
students” are defined in this study as those
students whose Otis Gamma IQ’s were more
than one standard deviation below the mean
of the sample. “More intelligent students”
are those whose IQ’s were more than a
standard deviation above the mean of the
sample; and (2) A student was classified as
coming from a broken home if the records
indicated that he was not presently living with
both natural parents.

Treatment of the Data. The mean num- 
ber of problems reported was computed for 
each of the variables and problem areas which 
was relevant with respect to the hypotheses. 
The hypotheses were tested by computing the 
critical ratio of the difference between these 
means. 
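
The critical ratio described above can be illustrated with a minimal sketch. It assumes the conventional large-sample form (the difference between two independent means divided by the standard error of that difference) and the usual two-tailed normal cut-offs of 1.96 and 2.58 for the .05 and .01 levels. The function names and the example figures are hypothetical and are not taken from Table 1.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error."""
    se_a = sd_a / math.sqrt(n_a)   # standard error of the first mean
    se_b = sd_b / math.sqrt(n_b)   # standard error of the second mean
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff


def confidence_level(cr):
    """Conventional two-tailed normal cut-offs (an assumption of this sketch)."""
    if abs(cr) >= 2.58:
        return ".01"
    if abs(cr) >= 1.96:
        return ".05"
    return "not significant"


# Hypothetical groups: means 5.0 and 4.0, SDs 3.0 and 4.0, 100 students each.
example_cr = critical_ratio(5.0, 3.0, 100, 4.0, 4.0, 100)
print(round(example_cr, 2), confidence_level(example_cr))
```

Read this way, a critical ratio of 3.40 would fall beyond the .01 level and one of 2.14 beyond the .05 level, which matches the grouping of the hypotheses reported under Results.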

Results 


1. Hypotheses 1 and 2 were confirmed. In 
Table 1 it will be seen that the differences be- 
tween the mean number of problems reported 
in each case were significant at or beyond the
.01 level of confidence.

2. Hypotheses 3, 4, 5, and 6 were con- 
firmed. In Table 1 it will be seen that the 
differences between means were significant at 
the .05 level of confidence. 

3. Hypothesis 7 was not confirmed. No sta-
tistically significant difference between means was found.


Summary and Conclusions 


The problem of determining the validity of 
the high school form of the Mooney Problem 
Check List was attacked by computing the 
mean number of problems checked in par-
ticular problem areas by a group of high
school students who were classifiable into
various discrete groups. 

This study was founded upon the assump- 
tion that the essential test of the validity of 
an instrument of this kind consists in deter- 
mining whether or not the students can recog- 
nize their own problems, find these problems 
represented on the Check List, and record 
them. If these three criteria are met the 
mean number of problems checked in par- 
ticular areas by various groups should differ 
significantly in a reasonable and predictable 
way. 

Hence, seven such differences were hy- 
pothesized on rational grounds. That is, be- 
cause of the sociological and psychological 
characteristics of particular groups, it was 
predicted that some groups would check more 
problems in certain areas than other groups,
providing the three criteria of validity speci-
fied above were met by the Check List. Of
the seven differences hypothesized, six were 
found. 

It is concluded that these findings present 
prima facie evidence for the validity of the 
Check List. 

Received September 2, 1952. 




References 


1. Mooney, R. L. Exploratory research on stu-
dents’ problems. J. educ. Res., 1943, 37, 218-
224.

2. Mooney, R. L. and Price, Mary. Manual to Ac-
company Mooney’s Problem Check List, High
School Form. Columbus, Ohio: Ohio State
University, 1948.

3. Olson, W. C. The waiver of signatures in per- 
sonal data reports. J. appl. Psychol., 1936, 
20, 442-450. 




THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953


A Comparison of Manual and College Norms for the MMPI


Fred. T. Tyler and John U. Michaelis


School of Education, University of California, Berkeley, California 


Various investigators have suggested that 
the distributions of scores of college students 
on certain of the MMPI scales are different 
from those reported in the manual of direc- 
tions. For instance, McKinley (2) reported 
that college students obtained higher K-scores 
than did a less selected (intellectually and 
educationally) population. Similarly, college 
men have been found to score high on the 
masculinity-femininity scale (3), indicating “a
deviation of the basic interest pattern in the
direction of the opposite sex” (1, p. 5). The
MMPI has been extensively used in research
and clinical investigations at the college level,
so that it should be of interest to know some-
thing of the extent to which T-scores obtained
from the standardization group correspond to 
those based upon a college population. 

The MMPI (Booklet, short form) was ad-
ministered to nearly one thousand juniors,
seniors and first-year graduate students in
education courses in the School of Education
at the University of California, either in
Education 110 (Educational Psychology) or in
those courses required in the programs
leading to the General Elementary and General
Secondary Teaching Credentials. T-scores
were computed for each of the nine clinical
scales¹ for men (N = 470) and women (N =
571) separately. These samples are consid-
erably larger (about 60 and 40 per cent re-
spectively) than those of the original stand-
ardization groups. It is recognized that these
norms might not be representative of the gen-
eral college population, since the majority of
the subjects were expecting to enter the teach-
ing profession. However, the college norms
to be discussed here may be of interest to
those who are administering this Inventory to
college students, and especially to those con-
cerned with personnel problems in a teacher
education program.

It is the purpose of this note to present a 
brief comparison between the norms reported 
by Hathaway and McKinley (1) and by 
Tyler and Michaelis (6). Comparative data 
are presented in Table 1 for women and Ta- 
ble 2 for men. 

For the women, there seem to be some dif- 
ferences between the norms on three scales, 


1 The scores were not corrected by means of the 
K-score (2, 7). 


Table 1 


Raw Score Equivalents of Selected Standard Score Values Based on Manual and
U. C. Norms for Female Subjects

                          Standard Scores
            80        70        60        50        40        30
Scales    M*   C†   M    C    M    C    M    C    M    C    M    C
aa Z “ 7 il 2 7 7 4 1 1 
D m 99 29 24 24 19 19 1 4 3 P 
= za 30 24 26 19 2 B i 
Pa 36 34 30 22 ie) 8 13 da 9 10 5 
Mt 26 26 a 29 31 33 36 37 u 42 o g 
a nt 1s 14 H 12 s 9 3 6 ay Od 
= a g 24 a 1 1310 ng et 
= oe 27 21 19 15 w o E: 
35 28 7 ee s 3 l 
Ma 3 21 B 2 190 19 TE: 


* Raw score equivalents based on norms in the Manual (1).
† Raw score equivalents based on college norms (6).


Table 2
Raw Score Equivalents of Selected Standard Score Values Based on Manual and
U. C. Norms for Male Subjects

                          Standard Scores
            80        70        60        50        40        30
Scales    M*   C†   M    C    M    C    M    C    M    C    M    C
Hs        18   14   13   10    9    7    4    4    0    0
D         29   34   25   28   21   23   17   18   12   13
Hy        33   32   27   28   22   24   16   20   11   16
Pd        26   27   22   23   18   19   14   15   10   10    6    6
Mf        36   45   30   39   25   33   20   27   15   21  [?]   18
Pa        18   17   15   14   11   12    8    9    5    6    1    4
Pt        31   30   24   23   17   16   10    9    2    1
Sc        32   30   24   23  [?]   15    9    8    2    1
Ma       [?]   28  [?]   24   18   20   14   16    9   12  [?]    8

[?] marks an entry that is illegible in this copy.


* See footnote to Table 1. 


Hs, Pt and Sc. Using the manual norms as 
the basis for interpretation, it is suggested 
that the college women were less concerned 
about their health, were freer from fears and 
lack of confidence, and from bizarre or un- 
usual thoughts (1, 4). On the remainder of 
the scales there was a high degree of similarity 
between the norms obtained from the two 
samples. 

Several comments about the two sets of 
norms are in order: 

1. The college men obtained relatively lower
scores than did the standardization sample on
only one scale, Hs. Accepting the usual in-
terpretations of this scale, it appears that the
college men were less concerned than were
the members of the standardization group
about the state of their health.

[Figure 1: line graph; vertical axis T-scores (50 to 90), horizontal axis MMPI Scale; curves labeled Manual Norms and College Norms.]

Fig. 1. T-scores on the Manual norms of raw scores corresponding to T-scores of 70 on
the College norms (Men).

2. The college men appeared to be some-
what less depressed and more feminine in
their interests when compared with the mem-
bers of the original standardization group.
For instance, on the D scale, a raw score of
29 has a T-score equivalent of 80 on the
manual norms and of 71 on the college norms.
Similarly on the Mf scale, a raw score of
[illegible] has T-values of 69 and 55 on the
manual and college norms respectively.

3. Raw scores on the remaining scales had
very similar T-values on both sets of norms,
especially at that part of the scale, the upper
end, which is claimed to have clinical signifi-
cance.

Another method of comparing the two sets
of norms was suggested. Raw scores corre-
sponding to T-scores of 70 on the college
norms were selected as a basis of comparison.
The T-score equivalents on the manual norms
were obtained for these same raw scores and
are graphed in Figures 1 and 2. A T-score
of 70 was chosen because Hathaway and
McKinley consider it to be “a borderline score,
although useful interpretation will always de-
pend upon the clinician’s experience with a
given group” (1, p. 8).

[Figure 2: line graph; vertical axis T-scores (60 to 90), horizontal axis MMPI Scale (Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma); curves labeled College Norms and Manual Norms.]

Fig. 2. T-scores on the Manual norms of raw scores corresponding to T-scores of 70 on
the College norms (Women).

At the upper ends of the scales, it is ap-
parent that discrepancies between college and
manual norms are found on the Hs, D and
Mf scales for men, and on Hs, Pt and Sc for
women.

In general, the differences between the two
sets of norms appear to be relatively minor.


However, multivariate analysis by means of
Hotelling’s T² (5) might reveal significant dif-
ferences between the standardization and col-
lege groups.
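
The comparison made above for the D scale can be sketched in a few lines. This is only an illustration: it assumes linear interpolation between the tabled anchor points, which is our simplification and not necessarily how the published norm tables were read, and the function name is hypothetical. The anchor values are the D-scale entries for men in Table 2.

```python
def t_score_for_raw(raw, raw_by_t):
    """Linearly interpolate a T-score for a raw score from tabled anchor points.

    raw_by_t maps tabled T-scores to their raw score equivalents.
    """
    points = sorted((r, t) for t, r in raw_by_t.items())
    for (r_lo, t_lo), (r_hi, t_hi) in zip(points, points[1:]):
        if r_lo <= raw <= r_hi:
            return t_lo + (raw - r_lo) * (t_hi - t_lo) / (r_hi - r_lo)
    raise ValueError("raw score lies outside the tabled range")


# D scale, men, as read from Table 2 (T-score -> raw score equivalent).
manual_d = {80: 29, 70: 25, 60: 21, 50: 17, 40: 12}
college_d = {80: 34, 70: 28, 60: 23, 50: 18, 40: 13}

t_manual = t_score_for_raw(29, manual_d)
t_college = t_score_for_raw(29, college_d)
print(t_manual, round(t_college, 1))
```

For a raw score of 29 the sketch gives a manual T-score of 80 and a college T-score a little under 72, close to the value of 71 quoted in the text; the small gap is presumably due to the interpolation assumed here.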


Received October 29, 1952. 


References 


1. Hathaway, S. R. and McKinley, J. C. The Min- 
nesota Multiphasic Personality Inventory. New 
York: The Psychological Corporation, 1943. 

2. McKinley, J. C., Hathaway, S. R., and Meehl,
P. E. The Minnesota Multiphasic Personality
Inventory: VI. The K Scale. J. cons. Psy-
chol., 1948, 12, 20-31.

3. Nance, R. D. Masculinity-femininity in prospec-
tive teachers. J. educ. Res., 1949, 42, 658-
666.

4. Tyler, F. T. A factorial analysis of fifteen MMPI
scales. J. cons. Psychol., 1951, 15, 451-456.

5. Tyler, F. T. Some examples of multivariate analy-
sis in educational and psychological research.
Psychometrika, 1952, 17, 289-296.

6. Tyler, F. T. and Michaelis, J. U. University-stu-
dent norms for the Minnesota Multiphasic Per-
sonality Inventory. Mimeographed. Avail-
able on request to the writers.

7. Tyler, F. T. and Michaelis, J. U. K-scores ap-
plied to MMPI scales for college women. (To
appear in Educ. psychol. Measmt.)




THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 4, 1953 


Socio-economic Status and Culturally-weighted Test 
Scores of Negro Subjects 


Frank C. J. McGurk 
Lehigh University 


A previous article (6) reported on the non- 
cultural and cultural test scores of 213 pairs 
of white and Negro high school seniors who 
had been matched for age, school attendance, 
school curriculum, and eleven selected socio- 
economic factors. All subjects were between 
the ages of sixteen and twenty; the mean age 
for the whites was 18.1, and the mean age for 
the Negroes was 18.2. The subjects were ob- 
tained from schools in southeastern Pennsyl- 
vania and northern New Jersey. 

Socio-economic status was defined in terms 
of the score obtained on a revision of the Sims 
Record Card. The high Negro socio-economic 
group is composed of those Negro subjects 
whose revised Sims scores were in the highest 
25% of the range of Negro Sims scores. The 
low Negro socio-economic group is composed 
of those Negro subjects whose revised Sims 
scores were in the lowest 25% of the range of 
Negro Sims scores. 

Test questions were defined as non-cultural 
and cultural according to the pooled judg- 
ments of 78 school teachers, psychologists, 
and sociologists. Non-cultural questions are 
those which the judges considered least cul- 
turally-weighted; cultural questions are those 
considered heavily weighted with cultural ex- 
periences. 

Complete details on the selection and match- 
ing of subjects, the revision of the Sims Rec- 
ord Card and the determination of socio-eco- 
nomic status, and the dichotomizing of the 
questions can be found in the previous article 
(6). 

This paper is concerned with the problem 
of how the socio-economic status of Negro 
subjects affects their differential test perform- 
ance on non-culturally-weighted and cultur- 
ally-weighted test questions. 


Results 


Table 1 shows the mean non-cultural and
cultural scores of the high and low socio-eco-
nomic groups. Since the standard deviations
of the non-cultural and cultural scores are al- 
most identical (4.62 for the non-cultural and 
4.64 for the cultural) the findings will be pre- 
sented in terms of raw scores. 

The difference in mean non-cultural score 
that is associated with a difference in socio- 
economic level (the H — L difference) is 2.22; 
the H—L difference for the cultural questions 
is 1.21. Thus, in the comparison between the
highest and lowest socio-economic group of
Negroes, a greater difference is obtained on
the non-cultural, not on the cultural, ques-
tions. The difference between the two H - L
differences is significant only at the 20%
level.


Table 1

Negro Non-cultural and Cultural Mean Raw
Scores by Socio-economic Groups

                     Socio-economic Group*
Type of                                      Difference
Question             High       Low          (H-L)
Non-cultural         13.62      11.40        2.22
Cultural              9.81       8.60        1.21
Net change                                   1.01
SE of net change                             0.89
t                                            1.24
P                                            20% approx.

* N for each group = 53.


Discussion


In his study of the effects of length of resi-
dence in New York City on Negro test scores,
Klineberg found that, as the length of resi-
dence increased, test scores increased. He
also found that his results “. . . are much
clearer for the linguistic tests than for the
performance tests” (5, p. 44).

Various writers have described linguistic or
verbal tests as sensitive to differences in socio-
economic status because of the culturally-
weighted content of such tests (2, 3, 4).
Others have described performance or non-
verbal tests as not being as sensitive to socio-
economic level because the content of these
tests is not so culturally-weighted (1, 3, 5).
Hence, Klineberg’s findings have been inter-
preted to mean that increasing the socio-eco-
nomic status of the Negro should be ac-
companied by a greater improvement on cul-
turally-weighted material than on the less
culturally-weighted material.

The present findings do not support such
an interpretation. The present data show
that the test superiority of the Negro of high
socio-economic status over the Negro of low
socio-economic status is associated more with
a superior performance on the non-cultural
questions than on the cultural questions.


Received October 6, 1952. 




References


1. Alpers, T. G. and Boring, E. G. Intelligence test
scores of northern and southern white and
Negro recruits in 1918. J. abnorm. soc. Psy-
chol., 1944, 39, 471-474.

2. Bean, K. L. Negro responses to certain intelli-
gence test items. J. Psychol., 1941, 12, 191-
198.

3. Bean, K. L. Negro responses to verbal and non-
verbal test material. J. Psychol., 1942, 13,
343-353.

4. Brown, F. An experimental and critical study of
the intelligence of Negro and white kinder-
garten children. J. genet. Psychol., 1944, 65,
161-175.

5. Klineberg, O. Tests of Negro intelligence. In
Otto Klineberg (Ed.), Characteristics of the
American Negro. New York: Harper & Bros.,
1944.

6. McGurk, F. C. J. Comparison of the perform-
ance of Negro and white high school seniors
on cultural and non-cultural psychological test
questions. Washington: The Catholic Univer-
sity of America Press, 1951 (microcard).


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


The Relationship Between Rater Characteristics 
and Validity of Ratings 


Dorothy E. Schneider and A. G. Bayroff ¹


Personnel Research Section, Personnel Research and Procedures Branch, TAGO,
Department of the Army, Washington, D. C.


The Personnel Research Section, Personnel
Research and Procedures Branch, TAGO has
been conducting a series of studies aimed at
identifying personal characteristics of raters
which are associated with more valid ratings.
The problem is of particular significance to
this office which is responsible for the devel-
opment of efficiency reports for Army-wide
use. It is also important because the absence
of more objective criteria makes it necessary
to use multiple ratings in the validation of
other types of instruments. If personal char-
acteristics of the rater which are related to
the validity of ratings he gives could be found,
it might be possible to control them, perhaps
in the selection of raters, or in the form of a
“correction score.” While this may not be
feasible on official efficiency ratings, it could
profitably be done when obtaining ratings for
use as criteria. Further, it may be possible to
develop new techniques which would be more
independent of the personal characteristics of
the rater.


Problem


The study reported here was concerned with
the validity of ratings by raters differing in
three characteristics: aptitude test score, aca-
demic achievement, and rated over-all value
to the Army.


Method


Subjects. The population consisted of 400
officers (primarily majors and lieutenant colo-
nels) enrolled as students at the Army Com-
mand and General Staff College. The objec-
tive of this college is to train potential divi-
sion commanders and general staff officers
and its students represent a highly selected
group. The course was 42 weeks long and
the students were in close contact with each
other during the entire period.


¹ The authors wish to acknowledge the assistance
of various members of the staff, [illegible]
Rundquist, Mr. A. H. Birnbaum, and Mr. Joel T.
Campbell. The opinions expressed in this paper are
those of the authors and do not necessarily reflect
those of the Department of the Army.


The officers were assembled in groups of
33-40, the size of the classes, and asked to
rate their class associates. In this manner
each officer served as both rater and ratee.

Design of the Study. This report covers
one aspect of a larger research program on
rating methodology. Only those aspects of
the design pertinent to the present study will
be presented here.

The Criterion. The criterion measure was
an appraisal of over-all value to the Army.
The total population was randomly split into
two groups, A and B, each consisting of 200
officers. Officers in Group A provided the
criterion rankings for all officers in the popu-
lation. Each member of this group was re-
quired to evaluate his class associates by plac-
ing them in the highest, middle, or lowest
third of the class. Each rater evaluated 20
ratees from Group A and 20 ratees from
Group B; thus each member of the total popu-
lation was evaluated by 20 raters. The cri-
terion score was the mean of the 20 rankings.

Anonymous Rating Scale. This was an 8-
point rating scale on which the officers were
rated on over-all value to the Army. Each
of the scale units was described in some de-
tail. Group A completed 20 such ratings, 10
on officers in Group A, and 10 on officers in
Group B. Each of the 400 officers thus re-
ceived 10 ratings with this instrument, and
the mean of these ratings was used as his
score. The raters were informed that these
ratings were to be held in confidence, that the
ratings would not be made available to the
College or the Army, and that the raters need
not sign their names. These ratings will be
referred to hereafter as the unsigned ratings.

Identified Official Rating Scale. This was
the same 8-point rating scale just described.
It was used by Group B, each officer rating
10 ratees from Group A and 10 ratees from
Group B. Each of the 400 officers thus re-
ceived 10 such ratings, and the mean rating
served as his score. The raters were informed
that these ratings would be available for offi-
cial use for one year, and they were required
to sign the ratings. These ratings will be re-
ferred to hereafter as the signed ratings.
Forced Choice Pairs. The forced choice
technique has been used in Army officer effi-
ciency reports from July 1947 to September
1950. Its use in efficiency reports was dis-
continued largely because the form of forced
choice used was not acceptable. For an ex-
tended discussion, see Baier.² Research has
continued, nevertheless, with the objective of
developing more acceptable forms. One of
these forms contained the grouping of phrases
in pairs instead of tetrads as formerly used.
The phrases used were taken directly from
the former Efficiency Report, Form 67-1.
Twenty-four pairs were employed in this
study, members of each apparently equally
favorable, but differing in their ability to dis-
criminate better from poor officers as deter-
mined by their criterion scores. The rater
checked the phrase of each pair which he felt
was More Descriptive of the ratee. Scoring
was based on the discrimination values of the
phrases checked. One such rating form was
completed for each officer in the population
by the raters of Group B.
Controlled Check List. The controlled
check list was the second form of the forced
choice technique included in this study. The
phrases used in the 24 forced choice pairs
were grouped into two sets of 24 phrases each.
The rater selected the 12 phrases of each set
which he felt were most descriptive of the
ratee and scoring was based on the discrimi-
nation values of the phrases checked. One
controlled check list was rendered on each
officer in the population by the raters of
Group A.

Rater Variables. Raters were divided into
equal thirds, highest, middle, and lowest, on
the basis of scores on each of the variables
described below.


² Baier, D. E. Reply to Travers’ “A critical re-
view of the validity and rationale of the forced
choice technique.” Psychol. Bull., 1951, 48.



Aptitude. Scores on the Officer Classifica- 
tion Test (OCT), a high level aptitude test, 
were available for all officers. 

Final Class Standing. Officer students are 
ranked on academic work periodically during 
their attendance. The last ranking is the 
Final Class Standing (FCS). Prior to this 
study, class standing and the other variables 
were transferred to IBM cards in such man- 
ner that the identification of individual offi- 
cers was no longer possible. 

Derived Final Class Standing. To remove 
the contribution of aptitude from Final Class 
Standing, predicted FCS was obtained from 
the regression of FCS on OCT. The differ- 
ence between actual and predicted FCS of 
each officer was used as another measure of 
achievement referred to here as the Derived 
Final Class Standing (Derived FCS). 
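
The derivation just described amounts to taking residuals from a simple regression. The following is a minimal sketch under that reading, using an ordinary least-squares line; the function name and the five pairs of scores are hypothetical and serve only to show the arithmetic.

```python
def derived_scores(predictor, outcome):
    """Residuals of outcome after a least-squares regression on predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Actual score minus the score predicted from the aptitude measure.
    return [y - (intercept + slope * x) for x, y in zip(predictor, outcome)]


# Hypothetical OCT scores and class-standing scores for five officers.
oct_scores = [100, 110, 120, 130, 140]
fcs_scores = [52, 55, 61, 64, 68]
derived_fcs = derived_scores(oct_scores, fcs_scores)
print([round(d, 2) for d in derived_fcs])
```

Because the residuals are uncorrelated with the predictor by construction, a score derived in this way reflects achievement over and above what aptitude alone would predict.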

Over-all Value. This score was the mean 
of the 20 criterion rankings received by each 
rater. 

After the raters were divided into thirds, 
the validity of ratings by each of these thirds 
on each of the rating variables was obtained. 
For the 8-point scales, multiple ratings per 
ratee were available and validities were ob- 
tained for both single ratings and mean of 
ratings given by each rater. For the two 
forced choice forms, validities were obtained 
for single ratings only. 
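
The analysis for single ratings can be sketched as follows. The sketch assumes that the validity of a group of raters is the product-moment correlation between the ratings they gave and the criterion scores of the officers rated, which the paper does not spell out; all names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def split_into_thirds(raters, characteristic):
    """Order raters on one characteristic and cut them into equal thirds."""
    ordered = sorted(raters, key=lambda r: r[characteristic], reverse=True)
    size = len(ordered) // 3
    return {
        "highest": ordered[:size],
        "middle": ordered[size:2 * size],
        "lowest": ordered[2 * size:],
    }


def validity_of_group(raters):
    """Correlate each single rating given with the ratee's criterion score."""
    ratings = [rating for r in raters for rating, _ in r["ratings"]]
    criteria = [criterion for r in raters for _, criterion in r["ratings"]]
    return pearson_r(ratings, criteria)


# Hypothetical raters: an aptitude score and (rating given, ratee criterion) pairs.
raters = [
    {"oct": 150, "ratings": [(8, 2.9), (5, 2.0), (2, 1.1)]},
    {"oct": 145, "ratings": [(7, 2.7), (4, 1.9), (3, 1.2)]},
    {"oct": 130, "ratings": [(6, 2.8), (5, 1.7), (3, 1.5)]},
    {"oct": 125, "ratings": [(7, 2.2), (4, 2.1), (4, 1.3)]},
    {"oct": 110, "ratings": [(5, 2.6), (6, 1.8), (4, 1.4)]},
    {"oct": 105, "ratings": [(6, 2.0), (3, 2.3), (5, 1.6)]},
]

thirds = split_into_thirds(raters, "oct")
validities = {name: validity_of_group(group) for name, group in thirds.items()}
print({name: round(v, 2) for name, v in validities.items()})
```

The same grouping would be repeated for each rater variable and each rating form; for the 8-point scales the mean of the ratings received by each ratee could be correlated with the criterion in place of the single ratings.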


Results 


The validities of ratings given by raters of 
each of the levels of rater characteristics 
studied are presented in Table 1. 

In general, a direct relationship existed be- 
tween the validity of the ratings given and 
measures of the raters’ over-all value, apti- 
tude, and achievement. Except in a few in- 
stances, validity coefficients for the highest 
thirds were higher than those for the middle 
thirds and those for the middle thirds were 
higher than the validity coefficients for the 
lowest thirds. In no case did ratings by raters 
in the lowest thirds have higher validities than 
ratings by raters in the highest thirds. 

It will be noted that although the validities 
of ratings with the 8-point scale were slightly 
higher than those with the other types of rat- 
ings, the lower validity by poorer officers was 




Table 1
Validity of Ratings by Raters of Different Levels of Ability and Achievement

                                    Validity of Ratings on

Raters Divided           Unsigned            Signed              Forced     Controlled
into Highest,            8-Point Rating      8-Point Rating      Choice     Check List
Middle, and                                                      Pairs
Lowest Thirds on         Single    Mean      Single    Mean      Single     Single
                         Ratings   Ratings   Ratings   Ratings   Ratings    Ratings

Aptitude            H    .57       .74       .50       .72       [?]        [?]
(OCT)               M    .50       .71       .51       .66       [?]        [?]
                    L    .51       .69       .48       [?]       .30        [?]

Final Class         H    .58       .74       [?]       .73       .50        [?]
Standing            M    .54       .71       [?]       .65       [?]        [?]
(FCS)               L    [?]       .60       [?]       [?]       .40        [?]

Derived Final       H    [?]       .74       [?]       [?]       .49        [?]
Class Standing      M    [?]       .75       .50       .61       .36        [?]
(Derived FCS)       L    .46       .64       .42       .62       .35        .42

Criterion           H    .61       .79       .55       .69       .49        .48
Ranking             M    .48       .66       .51       .66       .30        [?]
                    L    .49       .66       .45       .65       .43        [?]

[?] marks an entry that is illegible in this copy.


present in all types of scales. Of the two 8- 
point rating scales, the validities were some- 
what higher for the unsigned than for the 
signed. However, Group A, which rendered 
the unsigned ratings, also performed the cri- 
terion ranking, so some degree of rater con- 
tamination may be present. 

Both forced choice forms had greater de- 
creases in validity from the highest third to 
the middle third of the rater groups than did 
the 8-point scales. There were three decreases 
as large as .19 for the forced choice forms, 
and the only comparably large decrease for 
the 8-point scales was .13. It thus appeared 
that raters of lesser ability, as defined in this 
study, produced more accurate ratings with 
the conventional technique than with forced 
choice or controlled check lists.


Summary 


Officers at the Army Command and Gen-
eral Staff College rated each other using four
techniques: two 8-point scales of over-all
value (one signed by the rater and the other
unsigned), and two forms of the forced choice
technique (forced choice pairs and a con-
trolled check list). Rater groups were di-
vided into highest, middle, and lowest thirds
on the basis of aptitude test score, final class
standing, final class standing predicted from
aptitude test score, and on the criterion rank
of over-all value achieved. For each third of
the groups, separate validity estimates of rat-
ings made were computed for both individual
ratings and mean ratings where available.

It was found that raters who scored high
on aptitude, achievement at the College, and
over-all value to the Army produced more
valid ratings than did raters who scored lower
on these variables. This trend was highly
consistent for the 8-point rating scales, and
clear, though not as direct, for forced choice
pairs and the controlled check list.


Received October 15, 1952. 




THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953


The Actuality Measure in the Study of Public Opinion 


Peter R. Hofstaetter 
The Catholic University of America 


When he first suggested an “actuality-
measure” (1) this author (7, 8, 9) aimed at
combining two aspects of rather strongly ego-
involved states of public opinion. He hy-
pothesized that a population of respondents
shows relatively great ego-involvement in a
question if two conditions are fulfilled: (a)
Few “don’t know” responses; and (b) An
even split between the Yes- and No-re-
sponses.¹

The formula which purports to measure the
actuality of a given question for a given popu-
lation of respondents was based on these two
notions:

$$A = \frac{\sqrt{p_{+}\,p_{-}}}{p_{0}}$$

The evidence this writer has meanwhile col-
lected seems to confirm the assumptions un-
derlying this formula (10).
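
As a rough check, the measure can be computed directly from the percentages in the three response categories. The sketch below assumes the form A = sqrt(p+ · p-) / p0, with all three quantities expressed as percentages; this form is inferred from the fact that it reproduces the A values tabled in this paper, and the function name is ours.

```python
import math


def actuality(p_plus, p_minus, p_zero):
    """Actuality of a question from the per cent Yes, No and Don't know.

    Assumed form: square root of (p+ times p-), divided by p0.
    """
    if p_zero <= 0:
        raise ValueError("the measure is undefined without undecided responses")
    return math.sqrt(p_plus * p_minus) / p_zero


# Percentages from Table 1 (harmful, not harmful, don't know).
smokers = actuality(52, 45, 3)
non_smokers = actuality(66, 24, 10)
print(round(smokers, 2), round(non_smokers, 2))
```

The measure rises as the undecided share shrinks and as the Yes and No shares approach an even split, which corresponds to the two conditions stated above.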

In this paper we shall first try to summarize
the available evidence for the designation of
A as a measure of “actuality” or of the in-
volvement of a group in a topic. This is sup-
posed to furnish an empirical check of the
validity of our measure.² Our second attempt
will then be to present a statistical model from
which the A-measure can be derived and
which allows us to account for the empirical
properties of A.

After having rather regularly analyzed poll-
reports during the last few years at least five
conditions became clear which tend to result
in relatively high actuality of questions.
Each statement will be illustrated by one ex-
ample; additional examples could easily be
obtained.
¹ We shall concern ourselves only with questions
that are answered in terms of a trichotomy, i.e.,
in the affirmative (“Yes”), in the negative (“No”) or
undecided (“Don’t know”). The respective percent-
ages of the responses falling in each of these cate-
gories are referred to as p+, p- and p0.

² A better check would obviously consist of an at-
tempt to predict the relative [illegible] of ques-
tions which had been answered by different popu-
lations. Work in this direction [illegible]
would go beyond the scope of the present paper.



1. The more pertinent an issue is felt to 
be the higher is its actuality for a particular 
group of respondents. See Table 1. 


Table 1

Distribution of Answers to the Question: “Do you
think cigarette smoking is harmful or not?”†

                          Harmful   Don’t know   Not Harmful
Respondents                  %          %             %            A
Cigarette smokers           52          3            45          16.12
Non-cigarette smokers       66         10            24           3.98

† AIPO, Dec., 1949.


2. The more imminent the event is to which 
the question refers the higher is the actuality 
of the question. See Table 2. 

3. The greater the change involved in the 
question the higher is the actuality. See Ta- 
ble 3. 

4. The richer the available background of 
experience the higher is the actuality of the 
question (Saenger and Gordon, 19). See Ta- 
ble 4. 

5. The higher the educational and/or the 
socio-economic level of the respondents the 
higher actualities questions tend to attain. 
The data in Table 5 are based on 14 ques- 
tions reported by the Psychological Corpora- 
tion’s Barometer (Link, 13, 14, 15, 16). 

One may infer from these data that quite 
a few of the usual poll questions are tuned to 
the mentality of the above-average strata of 
our population. Thus they often reflect the 
problems of the “intelligentsia” rather than 
those faced by the “man on the street.” 
There are, however, interesting exceptions to 
the general trend (Link, 13). See Table 6. 

Cases like the one just mentioned can prob- 
ably be understood in terms of point one of 
the present list, i.e., with reference to the 
greater pertinence of the question to the 
lower socio-economic levels. 


282 Peter R. Hofstaetter 
Table 2

Distribution of Answers to the Question: “Do you think the U. S. will find itself in another world war within, say, the next year? (five years?)”†

                   Within One Year                 Within Five Years
Date of         Yes   No Op.   No               Yes   No Op.   No
Interview        %      %       %       A        %      %       %       A
May 1950        22      8      70     4.91      57     19      24     1.95
Sept. 1950      29     16      55     2.50      58     22      20     1.55
July 1951       26     10      64     4.08      56     26      18     1.22
Dec. 1952       21     12      67     3.13      48     27      25     1.28

† AIPO.
Table 3

Distribution of Answers to the Question: “Would you approve or disapprove of the following changes of postage which have been suggested?”†

                                   Per Cent   Approve   No Op.   Disapprove
Change of Postage                  Increase      %        %          %          A
Postcards from 1 ct. to 2 cts.       100        52        7         41        6.59
Reg. mail from 3 cts. to 4 cts.       33        33        8         59        5.51
Air mail from 6 cts. to 7 cts.        17        51        9         40        5.02

† AIPO, Aug., 1949.


Taken together, these examples seem to 
suggest the empirical validity of the actuality- 
measure. They represent, however, only a 
small sample from the total available evi- 
dence. 


While the empirical validity of the actu- 
ality-measure seems to stand on relatively 
firm ground the arbitrariness of its derivation 
caused the present writer considerable head- 
ache. It will be the task of the present dis- 
cussion to remedy this shortcoming. 

Let us assume, for the sake of a model, 
that our respondents were to give their an- 
swers on the basis of two balls drawn from 


Table 4

Distribution of Answers to the Question: “Do you believe in the efficiency of the New York State law against discrimination?”

                                    Yes   Don’t Know   No
Respondents                          %        %         %       A
Persons who had experienced
  job discrimination                15       14        71     2.33
Others                              27       21        52     1.78


two urns. Each of these urns contains “a”
per cent white and “b” per cent black balls.
The following combinations can thus occur:
two white balls, one white and one black ball,
two black balls. According to the rules of
our game the respondents will have to render
a “Yes-statement” whenever they draw two
white balls, and a “No-statement” if both
balls are black. In case they draw one white
and one black ball the answer is going to be
“Undecided,” or “Don’t know,” “Uncertain,”
“No opinion,” etc.
The frequencies with which we expect these


Table 5

Actuality as a Function of the Socio-economic Level of the Respondents

                              Average       A (%)
Socio-economic Level             A       (Total = 100)
A (highest 10 per cent)        4.91
B (next 30 per cent)           4.04
C (next 40 per cent)           3.28
D (lowest 20 per cent)         2.03           61
Total                          3.32          100




The Actuality Measure in the Study of Public Opinion 283 


Table 6

Distribution of Answers to the Question: “Have businessmen a right to shut down?”

Socio-economic    Yes    Uncertain    No               A
Level              %         %         %       A       %
A                 60         8        32     5.48      83
B                 57         7        36     6.48      98
C                 46         7        47     6.64     100
D                 41         6        53     7.78     117

Total             49.5       8        43.5   6.63     100


responses to occur will follow the binomial
expansion:

$$(a + b)^2 = a^2 \ (\text{Yes}) + 2ab \ (\text{Don't know}) + b^2 \ (\text{No}).$$

The middle term in this expansion becomes
thus:

$$2ab = 2\sqrt{a^2 b^2} = p_0.$$

We replace now $a^2$ and $b^2$ by the observed
percentages of the Yes- and No-responses ($p_+$
and $p_-$ respectively) and thus obtain for the
middle term:

$$p_0 = 2\sqrt{p_+ p_-}.$$

For all binomial distributions the following
relationship holds:

$$P = \frac{2\sqrt{p_+ p_-}}{p_0} = 1.00.$$

This provides us with an easy way to ascertain
whether or not an empirical distribution
follows the binomial pattern. We have to reject
the simple binomial (or “Bernoullian”)
model whenever the ratio P differs significantly
from 1.00.

The magnitude P becomes thus a measure
for the applicability of the Bernoullian model
to an observed distribution of responses. It
can, however, be shown that our actuality
measure (A) serves this very same purpose
equally well:

$$P^2 = \frac{4\,p_+ p_-}{p_0^2} = 4A^2; \quad \text{consequently: } A = \tfrac{1}{2}P.$$

Since P equals 1.00 for all binomial distributions
A necessarily equals 0.50.

It thus becomes clear that our actuality


Table 7

Distribution of Answers to the Question: “Do you think that Socialism in England will succeed or fail?”

Socio-economic    Succeed    Don’t Know    Fail               A
Level                %           %           %        A       %
A                   19          24          57      1.37     236
B                   13          31          56      0.87     150
C                   10          42          48      0.52      89
D                   11          60          29      0.30      52
Total               12          41          47      0.58     100


measure is nothing but a quantitative expres- 
sion for the appropriateness of the Bernoul- 
lian model for the representation of an em- 
pirical distribution of responses. It should be 
noted that this model does not presuppose 
any specific proportionality between the two 
initial probabilities “a” and “b.” The ap- 
pearance of a rather substantial majority for 
either “Yes” or “No” is therefore compatible 
with the Bernoullian model. 

The examples given in the first part of this 
paper have shown rather consistently actuali- 
ties higher than 0.50. Indeed, it is not easy 
to locate questions which yield actualities 
that low or lower. An example may be 
drawn, however, from one of Link’s polls 
(14). See Table 7. 

The response distribution of Group C (A
= 0.52) comes indeed very close to the expansion
of (0.30 + 0.70)² = 9% (Succeed) +
42% (Don’t know) + 49% (Fail).

Since poll data conform only rarely to this 
pattern we have to recognize that the Ber- 
noullian model is too simple to account for 
them. But we know already that its failure 
in that respect can be gauged from either 
P or A.
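The two indices discussed so far can be computed directly from a trichotomous response distribution. The following is only a minimal sketch: the function names `actuality` and `bernoulli_ratio` are chosen here for illustration and do not come from the paper, and the inputs are assumed to be the three percentages $p_+$, $p_0$ and $p_-$.

```python
import math

def actuality(p_plus, p_zero, p_minus):
    """A = sqrt(p+ * p-) / p0, with the three shares given in per cent."""
    if p_zero <= 0:
        raise ValueError("A is undefined when there are no neutral answers")
    return math.sqrt(p_plus * p_minus) / p_zero

def bernoulli_ratio(p_plus, p_zero, p_minus):
    """P = 2 * sqrt(p+ * p-) / p0 = 2A; equals 1.00 for a binomial split."""
    return 2.0 * actuality(p_plus, p_zero, p_minus)

smokers = actuality(52, 3, 45)        # Table 1, cigarette smokers
non_smokers = actuality(66, 10, 24)   # Table 1, non-cigarette smokers
binomial = actuality(9, 42, 49)       # expansion of (0.30 + 0.70)^2

print(round(smokers, 2), round(non_smokers, 2), round(binomial, 2))
```

Under these assumptions the Table 1 rows reproduce the printed A-values, and the exact binomial split 9/42/49 gives A = 0.50 and P = 1.00.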

The next step we have to undertake is to 
replace the model of two urns with the same 
probabilities (“a” and “b”) in each by the 
model of two urns with different probabilities. 
Obviously, our first model can be considered 
as a special case of this second, more general 
model. 

Let us assume that the respondents base 
their decisions on the drawing of one ball 
from each of two urns where the following 




Table 8

The Markov Model of a Response Distribution

Urn              Probabilities                            A
I                a₁ = 0.50     b₁ = 0.50
II               a₂ = 0.70     b₂ = 0.30
III              a₃ = 0.10     b₃ = 0.90
Responses (%)    p₊ = 35     p₀ = 20     p₋ = 45        1.98


probabilities prevail:

$$a_1 \neq a_2 \quad \text{and} \quad b_1 \neq b_2.$$

The corresponding percentages of the three
categories of answers are:

$$p_+ = a_1 a_2$$
$$p_0 = (a_1 b_2) + (a_2 b_1)$$
$$p_- = b_1 b_2$$

This model always results in distributions
which have actualities of less than 0.50. Its
usefulness is therefore drastically limited.
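The upper bound of 0.50 for two independent urns is stated without proof. One way to see it, offered here only as a sketch and assuming that each urn holds nothing but white and black balls ($b_i = 1 - a_i$), is the inequality between the arithmetic and the geometric mean:

```latex
\begin{aligned}
p_0 &= a_1 b_2 + a_2 b_1 \;\ge\; 2\sqrt{a_1 b_2 \cdot a_2 b_1}
     \;=\; 2\sqrt{(a_1 a_2)(b_1 b_2)} \;=\; 2\sqrt{p_+ p_-}, \\[4pt]
A &= \frac{\sqrt{p_+ p_-}}{p_0} \;\le\; \frac{1}{2},
\qquad \text{with equality only if } a_1 b_2 = a_2 b_1,
\ \text{i.e. } a_1 = a_2 .
\end{aligned}
```

Equality thus holds only in the Bernoullian special case; any difference between the two urns pushes A below 0.50.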

This will no longer be the case with the 
third model which employs the statistical 
theory of the “Markov Chains” (5). Let us 
assume that there are three urns with differ- 
ent probabilities for white and black balls: 


$$a_2 > a_1 > a_3 \quad \text{and} \quad b_2 < b_1 < b_3.$$


Whenever a subject has drawn a white ball 
from Urn I he is bound to draw his second 
ball from Urn II where the probability of a 
white ball is even greater than in the first urn. 
In the event of a black ball being drawn from 
the first urn the subject is required to draw 
the second ball from Urn III which gives him
again an even higher probability to draw a 
black ball than the first urn did. This model 
abandons the independence condition of the 
Bernoullian model. A numerical example 
may help to clarify this notion. See Table 8. 

The essential feature of this model is that 
the probabilities which prevail for the draw- 
ing of the second ball depend on the result of 
the first drawing. Such probabilities have 
been called “dependent” or “conditional” 
probabilities. It can be seen easily that the 
Markov model includes the Bernoullian model 


as a special case of $a_1 = a_2 = a_3$. The outcome
of a so-called Markov process is defined
by the following equations:

$$p_+ = a_1 a_2$$
$$p_0 = (a_1 b_2) + (b_1 a_3)$$
$$p_- = b_1 b_3$$


Though it may seem as if we had three
equations for the determination of the three
unknown variables ($a_1$, $a_2$ and $a_3$) we will not
be able to solve these equations since only two
of the p-values are independent. The problem
is therefore bound to remain underdetermined
unless we introduce an arbitrary
assumption with respect to the initial probability
$a_1$. Our assumption may, for instance,
be: $a_1 = b_1 = 0.50$.³ Consequently, the following
equations will define the conditional
probabilities:

$$a_2 = \frac{1}{a_1}\,p_+ = 2p_+;$$

$$a_3 = \frac{p_0 - a_1 b_2}{b_1} = 2(p_+ + p_0) - 1.00.$$
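The three-urn equations and their inversion can be checked numerically against Table 8. The sketch below is illustrative only: the function names are hypothetical, all quantities are taken as proportions (0.35 rather than 35 per cent), and the default $a_1 = 0.50$ is the arbitrary assumption named in the text.

```python
import math

def markov_distribution(a1, a2, a3):
    """Shares (p+, p0, p-) produced by the three-urn model, as proportions."""
    b1, b2, b3 = 1 - a1, 1 - a2, 1 - a3
    p_plus = a1 * a2
    p_zero = a1 * b2 + b1 * a3
    p_minus = b1 * b3
    return p_plus, p_zero, p_minus

def conditional_probabilities(p_plus, p_zero, a1=0.5):
    """Recover (a2, a3) from observed proportions, given an assumed a1."""
    b1 = 1 - a1
    a2 = p_plus / a1
    a3 = (p_zero - a1 * (1 - a2)) / b1
    if not (0 <= a2 <= 1 and 0 <= a3 <= 1):
        raise ValueError("assumed a1 is untenable for this distribution")
    return a2, a3

p_plus, p_zero, p_minus = markov_distribution(0.50, 0.70, 0.10)   # Table 8
A = math.sqrt(p_plus * p_minus) / p_zero
a2, a3 = conditional_probabilities(p_plus, p_zero)

print(round(p_plus, 2), round(p_zero, 2), round(p_minus, 2), round(A, 2))
print(round(a2, 2), round(a3, 2))
```

With the Table 8 probabilities the sketch returns the shares 0.35, 0.20 and 0.45 and recovers $a_2 = 0.70$ and $a_3 = 0.10$. A distribution with $p_+ + p_0$ below one half makes $a_3$ negative, which is the case the error branch is meant to catch.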


Since the observational data we collect by
means of poll questions do not allow us to determine
all three probabilities needed for the
Markov model its usefulness may seem to be
sort of academic. Yet, it should be pointed
out that this model shows one property which
furthers our understanding of the processes
which underlie the formation of public opinion.
In order to become aware of this aspect
we have only to replace the term “urn” by
the term “source of information.” The
Markov model tells us that the conclusions
our respondents draw from one source of information
(these conclusions may be either
favorable or unfavorable with respect to the
question at issue) determine their exposure to
other sources of information (some of them
being more and others less likely to confirm
a favorable attitude). What accounts ultimately
for an observed response distribution
is thus a certain amount of selectivity with
respect to the sources of information to which
the respondents expose themselves. What
this boils down to is that the drawing from
different sources of information does not occur
in a statistically independent manner. On
the contrary, the sources of information which
our respondents actually recognize are in most
cases intercorrelated.

³ This assumption becomes untenable in the case of: $p_+ + p_0 < 50$.

Once the first choice has occurred the sub- 
sequent choices tend to be made on the basis 
of sources of information which are more 
likely to support the initial choice than to 
contradict it. Thus a series of choices gets 
underway which assures the individual sub- 
ject of a relatively stable and un-ambivalent 
attitude. By the same token we may infer 
that neutral responses (“Don’t know,” etc.) 
tend to vanish in the group of respondents as 
a whole. What we have called “actuality” 
and have operationally defined by our for- 
mula thus serves as an indicator for the 
amount of “bundling” of sources of information
which prevails in a given group of respondents
with respect to a given issue.


Table 9

The Markov Model of Two Correlated Sources

         Before Debate                        After Debate
  p₊      p₀      p₋       A           p₊      p₀      p₋       A
  36      46      18     0.55          36      18      46     2.26

             Urn II                               Urn II
Urn I      a₂      b₂     Sum        Urn I      a₂      b₂     Sum
a₁         36      23      59        a₁         36       9      45
b₁         23      18      41        b₁          9      46      55
Sum        59      41     100        Sum        45      55     100
           r_t = 0.08                           r_t = 0.83


In order to explore this last point further
we can restate the Markov model in the following
way: We assume two sources of information
(“urns”) with the same probabilities
for either positive or negative statements
in each of them: $a_1 = a_2$; $b_1 = b_2$. We can
thus set up a fourfold table in which all sorts
of trichotomic response distributions can be
arranged as is shown in Table 9.⁴

⁴ The tetrachoric correlation coefficients ($r_t$) are read from the well-known Thurstone diagrams (3).

The data of Table 9 are taken from an experiment
by Millson (17) in which the author
arranged for group discussions on the
topic of unemployment insurance. Before
and again after debate the participants were


asked to state their opinions. As can be seen 
from Table 9 the debate resulted in a height- 
ened actuality of the question. This is a 
fairly common event (10). In the lower 
half of Table 9 the same data have been 
brought in the form of fourfold tables in 
order to allow for the computation of tetrachoric
correlations. The value of $p_0$ had thus
to be divided into two cells, i.e., the combination
of $a_1$ with $b_2$ and of $a_2$ with $b_1$. One half
of $p_0$ was allotted to either cell.

The interpretation that can be given to
these data reads: Whereas the respondents
used the sources of information available to
them before the debate in an (almost) independent
manner a considerable degree of
bundling or patterning occurred after the debate.
The rise in actuality (from 0.55 to
2.26) corresponds with an increase in the
degree of bundling (from $r_t$ = 0.08 to $r_t$ =
0.83). In fact, there exists, in general, a definite
relationship between A and the corresponding
coefficient of correlation $r_t$. See
Figure 1.⁵
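The link between A and the degree of bundling can be illustrated without the computing diagrams. The sketch below swaps in a different technique, the cosine-pi approximation to the tetrachoric coefficient, $r \approx \cos\!\big(\pi / (1 + \sqrt{ad/bc})\big)$; it is not the procedure used in the paper. The function names are hypothetical. Because $p_0$ is split evenly between the two mixed cells, $\sqrt{ad/bc}$ equals $2A$, so under this approximation r depends on A alone.

```python
import math

def actuality(p_plus, p_zero, p_minus):
    """A = sqrt(p+ * p-) / p0."""
    return math.sqrt(p_plus * p_minus) / p_zero

def fourfold(p_plus, p_zero, p_minus):
    """Cells (a, b, c, d) of the fourfold table; p0 is halved between b and c."""
    half = p_zero / 2.0
    return p_plus, half, half, p_minus

def cos_pi_r(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation (not the diagrams)."""
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

before, after = (36, 46, 18), (36, 18, 46)   # Table 9, per cent
r_before = cos_pi_r(*fourfold(*before))
r_after = cos_pi_r(*fourfold(*after))

print(round(actuality(*before), 2), round(r_before, 2))
print(round(actuality(*after), 2), round(r_after, 2))
```

For the Millson data the approximation lands close to the coefficients reported in the text (about 0.08 before and about 0.84 after the debate), and it gives r = 0 exactly at A = 0.50.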

Actualities below 0.50 correspond to nega- 
tive correlations. This means that the re- 
spondents who have drawn their conclusions 
from the first source of information in a posi- 
tive manner are more likely to draw a nega- 
tive conclusion from the second source they 
consult, and vice versa. This is seldom the 
case. An example can be seen in the behav- 
ior of socio-economic group D in Table 7. 
In introspective terms this may be either con- 
sidered as “disappointment” or as a striving 
for objectivity. 

As we reach the end of our present investi- 


⁵ An empirical equation has been fitted to this curve:
a re = 0.96 — 1.78 4 
For A-values over 0.50 the agreement between expected
and observed $r_t$-values is excellent, below 0.50
it is tolerable. It had been hoped that this relation- 
ship would provide us with a test of significance for 
A. This hope may, however, not materialize be- 
cause of the great difficulties that enter into the de- 
termination of the significance of differences between 
tetrachoric coefficients. In the future a significance 
test for A in terms of the corresponding Chi-squares 
seems to be more likely. It can, however, be reason- 
ably inferred from Figure 1 that relatively small 
numerical differences between A-values in the range 
from 0.50 through 3.50 are much more likely to be 
significant than numerically larger differences be- 
tween higher A-values. 


Fig. 1. The relationship between actuality (A) and the bundling of sources of information ($r_t$).


gation it may be said that the actuality-meas- 
ure indicates the amount of discrepancy that 
exists between an observed response distribu- 
tion and the kind of distribution that one 
would expect on the basis of the Bernoullian 
model. This discrepancy can be expressed in
terms of the correlated functioning of two
sources of information (Markov model).
Needless to say, no model can ever be taken
literally. It is quite artificial to argue that
always two and just two sources enter into
the determination of a person’s attitude towards
a given social issue. Indeed, there is
little doubt that we are exposed to many
more sources of information with respect to
most issues.⁶ It seems, however, that the
more sources of information the ordinary person,
in accordance with his socio-economic
and/or educational level, consults, the
stronger becomes his chance to hit upon correlated
sources. This is in line with point
five in the first part of this paper. The other
four points previously made can now be summarized
by saying that the more important a
decision we have to make, the more prone we
are to expose ourselves to correlated sources
of information. Yet, this statement requires
at least one qualification. As has been shown
in an earlier paper (9) there exists a negative
correlation between the actuality and the experienced
difficulty of questions. Our subjective
experience that a question is difficult
seems to correspond with an un-bundling of
sources of information. Ultimately this causes
us to shift in the direction of the Bernoullian
model.

⁶ The “interference-hypothesis” which asserts “that any communication which successfully modifies a person’s beliefs will reduce the opinion-impact of any subsequent event or communication that tends to produce antithetical beliefs” has been recently confirmed (11).

To recognize fully the possibly large
number of sources of information would, of
course, complicate our derivations. The basic
postulate, however, will probably remain unaltered,
that is, that one can either use the
multitude of sources of information in a complete
and unselected manner or use them selectively.
From a theoretical point of view
the first procedure is undoubtedly preferable;
in practice the latter seems to be the rule. To
follow this “rule” minimizes the ambiguity of
perceived situations and thus eases the burden
of uncertainty.⁷ Ceteris paribus this tendency
may be expected to show up more strongly
the more deeply ego-involved we are in certain
issues. Actuality measures the strength
of this tendency.


⁷ This very same mechanism has been described in several situations by different terms, for instance as the “selectivity” of learning and forgetting (4, 12) or as “intolerance of ambiguity” (6) with regard to perception. The relationship between “intolerance of ambiguity” and ethnocentrism has been demonstrated (1). A comparable tendency to escape from ambiguity seems to underlie the “Halo-Effect” (Thorndike) in personality judgment. The correlated use of indicators refers also to this situation (2). Werner’s (20) concept of “rigidity” is also based upon the notion of either too much isolation or too much overlapping of subareas, i.e., upon their positive or negative correlation. The “hypothesis-confirmation theory” of Postman (18) provides a suitable frame of reference for these phenomena.


Summary


1. Poll questions which can be answered by either “Yes,” “No” or “Don’t know” yield response distributions which can be compared with binomial expansions.

2. The actuality-measure indicates the discrepancy between an observed response distribution and that expected from the binomial model.

3. All kinds of trichotomic response distributions can be fitted to the Markov model of dependent probabilities.

4. Under simplifying conditions, the actuality-measure becomes an indicator of the amount of bundling (or correlation) among the sources of information to which the respondents have exposed themselves.

5. The bundling of sources of information increases with the subjectively felt importance of the issue to which the question relates and also, in general, with the socio-economic and/or the educational level of the respondents. It decreases with the subjectively felt difficulty of the issue.


Received August 20, 1952. 


References

1. Block, J., and Block, Jeanne. An investigation of the relationship between intolerance of ambiguity and ethnocentrism. J. Pers., 1951, 19, 303-311.
2. Brunswik, E. Systematic and representative design of experiments. Berkeley: Univ. of Calif. Press, 1947.
3. Chesire, L., Saffir, M., and Thurstone, L. L. Computing diagrams for the tetrachoric correlation coefficient. Chicago: Univ. of Chicago Bookstore, 1933.
4. Edwards, A. L. Political frames of reference as a factor influencing recognition. J. abnorm. soc. Psychol., 1941, 36, 34-61.
5. Feller, W. An introduction to probability theory and its applications. New York: Wiley, 1950.
6. Frenkel-Brunswik, E. Intolerance of ambiguity as an emotional and perceptual personality variable. J. Pers., 1949, 18, 108-143.
7. Hofstaetter, P. R. Die Psychologie der oeffentlichen Meinung. Wien: Braumueller, 1949.
8. Hofstaetter, P. R. The actuality of questions. Int. J. Opin. Attitude Res., 1950, 4, 16-26.
9. Hofstaetter, P. R. Importance and actuality. Int. J. Opin. Attitude Res., 1951, 5, 31-52.
10. Hofstaetter, P. R. Einfuehrung in die Sozialpsychologie. Wien: Humboldt, 1953.
11. Janis, I. L., Lumsdaine, A. A., and Gladstone, A. I. Effects of preparatory communications on reactions to a subsequent news event. Publ. Opin. Quart., 1951, 15, 487-518.
12. Levine, J. M., and Murphy, G. The learning and forgetting of controversial material. J. abnorm. soc. Psychol., 1943, 38, 507-517.
13. Link, H. C. The Psychological Corporation’s index of public opinion. J. appl. Psychol., 1946, 30, 297-309.
14. Link, H. C. The ninety-fourth issue of the Psychological Barometer and a note on its fifteenth anniversary. J. appl. Psychol., 1948, 32, 105-117.
15. Link, H. C., and Freiberg, A. D. The ninety-seventh Psychological Barometer. J. appl. Psychol., 1948, 32, 443-451.
16. Link, H. C., and Freiberg, A. D. The Psychological Barometer on Communism, Americanism and Socialism. J. appl. Psychol., 1949, 33, 6-14.
17. Millson, W. A. D. Problems in measuring audience reaction. Quart. J. Speech, 1932, 18, 621-637.
18. Postman, L. Toward a general theory of cognition. In J. H. Rohrer and M. Sherif (Eds.), Social psychology at the crossroads. New York: Harper, 1951.
19. Saenger, G., and Gordon, N. S. The influence of discrimination on minority group members in its relation to attempts to combat discrimination. J. soc. Psychol., 1950, 31, 95-120.
20. Werner, H. The concept of rigidity; a critical evaluation. Psychol. Rev., 1946, 43, 43-52.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


A Biasing Factor in Essay Response Frequency 


Erwin K. Taylor and Dorothy E. Schneider 


Personnel Research Institute, Western Reserve University 


In order to get descriptive phrases for use 
in developing an evaluation form, question- 
naires were mailed to 955 randomly selected 
members of a professional organization, the 
American Dietetic Association. The question- 
naire explained the purpose of the study and 
asked each respondent to describe an asso- 
ciate at a level of competence specified on the 
form. Each questionnaire specified a single 
level out of ten possible levels. Ninety-five 
or ninety-six questionnaires were sent out for 
each level. Of the ten levels, “1” was con- 
sidered best, and “10” poorest. Although 
there was some feeling on the part of the 
present writers that the returns, being on an 
anonymous and voluntary basis, would yield 
either a J- or U-shaped distribution, there 
was no published evidence to justify unequal 
distribution in mailing of the forms. Hence, 
each level was requested from an equal number
of recipients of the questionnaire.

In addition to the descriptive essays, re- 
spondents were asked a number of questions 
regarding the type of work engaged in when 
they were acquainted with the dietitian described,
length of acquaintance, and working
relationship between the two.

The questionnaires were mailed from Cleve- 
land on June 19, 1952; the first reply was re- 
ceived on June 24, and by July 10, the “dead- 
line” requested on the questionnaire form, 130 
forms had been returned. Within the two 
weeks following, 14 more were received, bring- 
ing the total to 144, or about a 15% return. 

The forms which were returned reveal some 
interesting points. The distribution at each 
level of competence is shown in Table 1, ac- 
cording to whether the respondent selected a 
subordinate, co-worker, superior, or student 
as subject for the essay.

It will first be noted that the distribution,
while not horizontal, does not reveal sufficient
U or J shape to be so named. The mean of
5.4 is very close to the 5.5 which would be expected
in a horizontal distribution. At only
two levels, 4 and 6, are the frequencies more
than 5 below the maximum frequency.

Table 1

Distribution of Returns on Questionnaire Form Requesting Descriptive Essays

Relation of Subject to Writer of Essay

Level of Competence Subor- Co- 
of Person Described dinate Worker Superior Student Total 
1 (Best) 6 0 j 
z : 10 2 18 
F a 3 10 0 18 
F ` 4 4 1 14 
3 2 5 5 10 
: 4 5 5 
3 i 5 ; 6 16 
7 H 3 2 0 9 
7 6 3 13 
8 ĝ 3 1 
9 ri 2 2 1 14 
10 (Poorest) 9 : 4 2 JA 
< 4 0 18 
Total 50 34 F a : 144 
Mean 5 : È 2 
5.96 6.26 435 31i am 

A Chi-square test was applied to the total
frequencies and revealed no statistically significant
deviation from the theoretical rectilinear
distribution. This indicates the absence
of bias in the returns in terms of the
level of competence of the individual to be
described.
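The Chi-square test against a rectangular distribution can be sketched as follows. The level totals used here are an assumption: nine of them are read from the Total column of Table 1, and the total for level 8, which is not legible, is inferred as 14 so that the ten values sum to the reported 144 returns. The critical value 16.919 is the tabled Chi-square for 9 degrees of freedom at the .05 level.

```python
# Level totals 1..10; the entry for level 8 (14) is inferred, not read from the table.
observed = [18, 18, 14, 10, 16, 9, 13, 14, 14, 18]

n_levels = len(observed)
expected = sum(observed) / n_levels          # equal returns expected at every level
chi_square = sum((o - expected) ** 2 / expected for o in observed)

CRITICAL_05_DF9 = 16.919                     # tabled value, 9 degrees of freedom
mean_level = sum(level * o for level, o in enumerate(observed, start=1)) / sum(observed)

print(round(chi_square, 2), chi_square < CRITICAL_05_DF9, round(mean_level, 1))
```

Under these assumed totals the statistic stays well below the critical value, in line with the conclusion stated above, and the mean level comes out at the reported 5.4.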
Thus, while it has been conjectured that it
is easier to evaluate personnel at the extremes
of the effectiveness distribution, it does not
appear to be necessary to take this factor into
account in planning the collection of data of
this nature. Responses were received with almost
equal frequency for ratees at each of the
ten effectiveness levels used in this study.
When no restriction was placed upon the
selection of subjects for the essays, other than
the level of competence, about the same number
of supervisors and subordinates were selected.
However, some rather striking differences
will be observed in the mean levels of
competence of the two groups. It would seem
that the more competent “superiors” were remembered
by the recipients of the questionnaire
and selected for description, while those
who were to describe less competent individuals
tended to select subordinates.

It is interesting to note that no respondent 
who was asked to describe a No. 1 person— 
the best—described a co-worker, and that 
fewer co-workers were described than either 
superiors or subordinates. Although there is 
the same proportion of co-workers as subordi- 
nates in levels 6-10, the mean is lower. This 
is largely a function of the zero frequency at 
level 1. 

The number of descriptions of students is
small, but no unusual trends are noted in
manner of selection.


Received October 15, 1952. 




Rating Patterns for Maximizing Competition and Minimizing Number 
of Comparative Judgments Necessary for Each Rater 


Ray H. Simpson 


Department of Educational Psychology, University of Illinois 


The problem considered in this paper is 
represented by a situation which contains the 
following features: 

a. A large number of individuals or writ- 
ten products are to be ranked. 

b. The time required to evaluate and give 
a rank to each individual or product is of 
such magnitude that any single judge cannot 
be asked to rank more than 3, 4, or 5 indi- 
viduals or products. 

c. The researcher or administrator wants to 
use a large number of judges in order to maxi- 
mize the reliability of the combined rating or 
ranking. 

d. Each individual or product should di- 
rectly compete with as many other individu- 
als or products as possible. 

e. The total group of individuals or prod- 
ucts with which a particular individual or 
product competes should represent a random 
sample of the total group of competitors 
against whom he is to be ranked. 

The method of paired comparisons would 
obviously not be appropriate since it would 
take too much time and be too fatiguing to 
the judges. Forty individuals or products 
to be ranked, for example, would involve 
n(n — 1)/2 or 780 pairs to be compared by 
each judge.
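The count of 780 follows from the number of distinct pairs among n items, n(n - 1)/2. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
from math import comb

def pairs_to_compare(n_items):
    """Number of distinct pairs among n_items, i.e. n(n - 1)/2."""
    return comb(n_items, 2)

print(pairs_to_compare(40))   # pairs each judge would face with forty items
print(pairs_to_compare(5))    # pairs implied when a judge ranks only five items
```

Ranking five items implies only ten pairwise decisions, which shows the scale of the saving sought when each judge is limited to a handful of items.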

Consideration was given to the possibility 
of using a technique suggested by Guilford¹
which involves selecting from all the individu- 
als or products a limited number to become 
the basis for a scale. This alternative was 
not used because of the difficulty in selecting 
appropriate products or individuals at ap- 
proximately equal intervals along the scale 
and because of the volume of work still re- 
quired of each judge. 

The technique described by Uhrbrock and
Richardson² was also considered. In it one
breaks the individuals to be rated into four
groups. Within each group each man is compared
with every other man in his group and
also with each of five “key” men. These latter
five form a type of interlocking yardstick.
While this method is much more economical
of the time of judges than the paired comparison
technique, it still was too time consuming
to meet all of the requirements of the
writer’s situation. As a result, the patterns
described below have been developed.

¹ Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936.

² Uhrbrock, R. S. and Richardson, M. W. Item analysis. Personnel J., 1933, 12, 141-154.


Rating Patterns 


Table 1 illustrates a pattern to be used
when: (a) there are 21 individuals or products
to be ranked from highest to lowest; (b)
there are to be 21 raters (A through U); (c)
each individual or product is to compete with


Table 1

Interlocking Design for Having 21 Raters Give Competitive Ranks to 21 Individuals or Products

Rater    Individuals or Products to be Rated
A         1     2     7     9    19
B         2     3     8    10    20
C         3     4     9    11    21
D         4     5    10    12     1
E         5     6    11    13     2
F         6     7    12    14     3
G         7     8    13    15     4
H         8     9    14    16     5
I         9    10    15    17     6
J        10    11    16    18     7
K        11    12    17    19     8
L        12    13    18    20     9
M        13    14    19    21    10
N        14    15    20     1    11
O        15    16    21     2    12
P        16    17     1     3    13
Q        17    18     2     4    14
R        18    19     3     5    15
S        19    20     4     6    16
T        20    21     5     7    17
U        21     1     6     8    18

Table 2

Abbreviated Rating Patterns with Each Judge Rating Five Individuals or Products

Total Number to be Rated: 25
Judge    Rates
A         1     2     4     8    16
B         2     3     5     9    17
C         3     4     6    10    18
...
W        23    24     1     5    13
X        24    25     2     6    14
Y        25     1     3     7    15

Total Number to be Rated: 26
Judge    Rates
A         1     2     4     8    13
B         2     3     5     9    14
C         3     4     6    10    15
...
X        24    25     1     5    10
Y        25    26     2     6    11
Z        26     1     3     7    12

Total Number to be Rated: 27
Judge    Rates
A         1     2     4     8    17
B         2     3     5     9    18
C         3     4     6    10    19
...
Y        25    26     1     5    14
Z        26    27     2     6    15
AA       27     1     3     7    16

Total Number to be Rated: 28
Judge    Rates
A         1     2     4     8    13
B         2     3     5     9    14
C         3     4     6    10    15
...
Z        26    27     1     5    10
AA       27    28     2     6    11
BB       28     1     3     7    12

Total Number to be Rated: 29
Judge    Rates
A         1     2     4     8    18
B         2     3     5     9    19
C         3     4     6    10    20
...
AA       27    28     1     5    15
BB       28    29     2     6    16
CC       29     1     3     7    17

Total Number to be Rated: 30
Judge    Rates
A         1     2     4     8    22
B         2     3     5     9    23
C         3     4     6    10    24
...
BB       28    29     1     5    19
CC       29    30     2     6    20
DD       30     1     3     7    21

Total Number to be Rated: 31
Judge    Rates
A         1     2     4     8    16
B         2     3     5     9    17
C         3     4     6    10    18
...
CC       29    30     1     5    13
DD       30    31     2     6    14
EE       31     1     3     7    15

Total Number to be Rated: 32*
Judge    Rates
A         1     2     4     8    16
B         2     3     5     9    17
C         3     4     6    10    18
...
DD       30    31     1     5    13
EE       31    32     2     6    14
FF       32     1     3     7    15

* Similar designs with the same initial sequence can be used with groups larger than 32.


as many other individuals or products as pos- 
sible; and (d) no judge or rater is to be asked 
to rank more than five individuals or prod- 
ucts. The time involved in the job of rating, 
reading, or observing necessitates this last re- 
quirement. 
Examination of Table 1 will show that individual
or product number 9, for example, is
in competition with each of the 20 other numbers.
This means that the final rank for
number 9 is as fair a rating as one can get
without excessive work on the part of individual
judges.
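The property claimed for number 9 can be checked for every item at once. The sketch below assumes that Table 1 is generated cyclically, each rater's list being the previous one shifted by one (mod 21). That reading is inferred from the rows of the table and is not stated in the text, and the function name `cyclic_design` is hypothetical.

```python
from collections import Counter
from itertools import combinations

def cyclic_design(n_items, offsets):
    """Rater k (0-based) ranks the items k + offset (mod n_items), numbered from 1."""
    return [[(k + off) % n_items + 1 for off in offsets] for k in range(n_items)]

# Offsets read off the first row of Table 1 (items 1, 2, 7, 9, 19).
design = cyclic_design(21, (0, 1, 6, 8, 18))

# How often does each pair of items appear together on some rater's list?
pair_counts = Counter(
    frozenset(pair) for block in design for pair in combinations(block, 2)
)
rivals_of_9 = {item for block in design if 9 in block for item in block} - {9}

print(design[0], design[3])
print(len(rivals_of_9), set(pair_counts.values()))
```

Under this reading every one of the 210 possible pairs is judged together by exactly one rater, so each item, not only number 9, meets all 20 others while no judge ranks more than five.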

This particular design was used by the 
writer in a class of 21 students where each 
student studied and rated, on the basis of 


Ray H. Simpson 


criteria previously established, the learning 
products of five peers. For each student to 
rank more would have been too fatiguing.
The sum of all ranks assigned a particular 
individual was used to rank the 21 students 
in the class. Other patterns, based on simi- 
lar designs, are represented in essence in Ta- 
ble 2. 
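
The cyclic assignment behind these patterns, and the sum-of-ranks scoring described above, can be sketched in a few lines of Python. This is an illustrative sketch only, not the writer's own procedure: the offsets (0, 1, 3, 7, 15) are read off the 32-item pattern in Table 2, the function names `cyclic_pattern` and `sum_of_ranks` are hypothetical, and the rankings fed in are made up purely to exercise the code.

```python
from collections import defaultdict

# Offsets read off the 32-item pattern in Table 2 (judge A rates 1, 2, 4, 8, 16).
OFFSETS_32 = (0, 1, 3, 7, 15)


def cyclic_pattern(n, offsets=OFFSETS_32):
    """Map each judge number (1..n) to the items that judge rates."""
    return {j: [(j - 1 + d) % n + 1 for d in offsets] for j in range(1, n + 1)}


def sum_of_ranks(rankings):
    """rankings[j] lists judge j's items from best (rank 1) to worst."""
    totals = defaultdict(int)
    for ordered in rankings.values():
        for rank, item in enumerate(ordered, start=1):
            totals[item] += rank
    # Lowest rank sum first; ties are broken by item number.
    order = sorted(totals, key=lambda item: (totals[item], item))
    return order, dict(totals)


pattern = cyclic_pattern(32)

# Made-up data: every judge happens to prefer lower-numbered items.
rankings = {j: sorted(items) for j, items in pattern.items()}
final_order, rank_sums = sum_of_ranks(rankings)

print(pattern[1])       # items rated by judge A
print(pattern[32])      # items rated by the last judge
print(final_order[:5])  # five best-placed items under the made-up rankings
```

Because each judge's list is the previous judge's list shifted by one, every item is rated by exactly five judges, which is what keeps the rank sums comparable across items.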

The technique and patterns suggested here 
have wide applicability. A few of the addi- 
tional areas where use would be feasible would 
be the following: ranking personnel such as in
an industrial, military or educational situa- 
tion; rating short stories; and ranking art 
forms for esthetic qualities. 


Received September 2, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953


Some Primary Ratable Characteristics of Instructional Films ¹


Philip Ash ² and Thelma R. Hobaugh


The Pennsylvania State College 


A central problem facing the instructional
film producer on the one hand, and the teacher
on the other, is the evaluation of the film as
to its adequacy as a teaching aid.

The primary objective of an instructional
film is, generally, to instruct. Therefore, the
first criterion of the adequacy of a film as an
instructional device must be: how well does
the film communicate its content, that is, how
much is learned from it?

However, a second important objective of
the instructional film is to interest, to hold
attention, possibly to entertain in the broad
sense. In some of its uses, for example as an
overview to a unit of instruction, the teacher
might intend that the film motivate the student
to want to learn about the content, with
little emphasis on information directly acquired
from the film. Furthermore, it seems
likely that the extent to which a film is
“interesting” in the sense that it gains and holds
the attention of the students will influence the
extent to which it communicates effectively.

Three kinds of estimates of the quality of
an instructional film therefore suggest themselves:
first, measures of learning based on
test results; second, estimates (judgments,
ratings) by people who have seen the film as
to the extent to which people will learn from
the film; and third, ratings of the affective
impact of the film more or less independently
of its effectiveness in imparting factual
information.


1 The research on which this article is based was
conducted under the auspices of the Instructional
Film Research Program under Contract N6onr-269,
Task Order VII, with the Special Devices Center
of the Office of Naval Research. This research is
reported in Technical Report SDC 269-7-23. The
authors gratefully acknowledge permission to analyze
the correlation matrix that grew out of the Ph.D.
dissertation completed by Loran C. Twyford, A
Comparison of Methods for Measuring Profiles of
Learning from Instructional Films, The Pennsylvania
State College, 1951; Publication [number illegible]
of University Microfilms, $2.76, Ann Arbor,
Michigan. Dr. Ash was responsible for [illegible]
the study and writing the report; Miss Hobaugh was
responsible for carrying out the factor analysis.

2 Dr. Ash is now a member of the [illegible]
Relations Research Department, Inland Steel
Company, East Chicago, Indiana.

It is the purpose of the present research to 
determine the extent to which these kinds of 
estimates are intercorrelated, to answer the 
question as to the extent to which, from a 
rating of teaching effectiveness or affective 
quality, it is possible to predict measured 
learning. 

A rough check on the teaching effectiveness 
of an instructional film may be made by ad- 
ministering, both before and after the film, a 
test generally covering the content of the film. 
Gains in test scores would reflect the over-all 
contribution of the film to the students’ fund 
of knowledge. Such a test, however, would 
not clearly indicate which parts of the film 
were effective and which were ineffective. 

A better method of evaluation might in- 
volve the development of a profile measure- 
ment which shows which parts of the film are 
effectively imparting information and which 
are less effective or ineffective in this respect. 
Furthermore, technics are available for ob- 
taining rating profiles of the “attention value” 
and other affective attributes of a film, and 
for obtaining judgments of teaching value. 
Profile methods of summating audience rat- 
ings with respect to successive sections of 
programs (entertainment films, radio shows) 
have been developed extensively during the 
past ten years, primarily in the radio and 
motion picture industries.³ Little comparable
work has been done with instructional films,
however.


Film Profile Analysis


Under the auspices of the Instructional 
Film Research Program at The Pennsylvania 
State College, Dr. Loran C. Twyford con- 
ducted an extensive study of the feasibility of 
a variety of methods for developing profiles 
of audience evaluations of the instructional
film. These profiles were graphs, drawn
against time as the base-line, of the summed
responses of the audience to several dimensions
of the film.


3 An extensive review of profile analysis technics
in radio and motion pictures is included in Dr.
Twyford’s study, op. cit.


The profiles he collected may be classified 
into three groups: 

(1) Profiles of measured learning and learn- 
ing gains. These profiles were based upon an 
exhaustive test of the film content. The test 
items were true-false form; every fact pre- 
sented in the film was covered by an item, 
and each item could be related back to a spe- 
cific point (in time, sequence, and footage) 
in the film. 

(2) Profiles of estimated teaching effective- 
ness. For these profiles the subjects were 
asked: (a) to rate whether they thought they 
were learning; (b) to rate whether they 
thought students would learn; or (c) to rate 
(on a third showing) whether they had 
learned the content presented. The ratings 
were made during the film showing, on a five-point
scale. The subjects made their responses
by means of the Film Analyzer ⁴ system,
which provides for the collection by
polygraphic recording of the responses of each 
of a group of 40 subjects. 

(3) Profiles of affect dimensions. Profiles 
were collected on the dimensions: (a) “Like- 
Dislike the film”; (b) “The film is Clear-Un- 
clear”; and (c) “The film is Good-Bad.” 
These profiles also were made on five-point 
scales and the judgments were recorded by 
means of the Film Analyzer system concur- 
rently with the viewing of the film. 


Procedures 


In all, Dr. Twyford collected data on 276 high- 
school and college students, divided into seven 
equated groups. 

4 At each of the 40 stations (chairs) of the Film
Analyzer system a box containing five piano-like keys
is located. Each subject presses a key signifying his
response, and a simultaneous record of the 40 response
channels is made on a moving paper. The
five choices are coded so that each response channel
produces dashed, solid, or a pair of dashed and solid
lines identifying the key being pressed. For a
description, see Carpenter, C. R., et al. The Film
Analyzer. Special Devices Center Technical Report
269-7-15, Instructional Film Research Program, The
Pennsylvania State College, 1950.

The students in the experiment met as intact
classes. A given class was then randomized into
seven groups. Each of the seven groups in a
class followed a different one of seven experimental
procedures.

The film shown to the experimental population
was a 10-minute section of a longer film on precision
measuring instruments. The original film
had been designed by the Instructional Film Research
Program. It is essentially an informational
film which undertakes to show various
kinds of precision measuring instruments and
their method of use. The 10-minute section
used in the experiment was a complete and intact
section of the original.

Groups I, II, and III each saw the film three
times, and made a continuous rating on each occasion.
Group IV took Form A of the test after
seeing the film once; Group V took Form B after
seeing the film once. Groups VI and VII took
Forms A and B respectively without seeing the
film.
The experimental directions were (in part) as
follows:

1. “I am learning.” Specific instructions given
to students making this rating were:

“You are to rate the film on the basis of your
learning from the film. When facts are presented
by the film that you already know, you should
report that you are not learning. Some facts in
the film may be so difficult that you could not
answer questions about them if you were asked
to do so after the film showing. These facts are
not learned. Learning means that some change
has occurred to you that we could discover by
comparing your knowledge before and after the
film showing.”

2. “I predict learning.” Specific instructions
given to students making this rating were:

“You are to predict the amount of learning you
would expect a group composed of your classmates
to learn from a single showing of the film.
The group will not be learning much if the material
being presented is previously known. The
group will not be learning if the material being
presented is too difficult. Learning may occur
when the film presents facts that are new.
Learning means that some change has occurred
to the person that we could discover by comparing
his knowledge before and after the film showing
by means of tests.”

3. “I like-dislike.” Students making this rating
were given the following instructions:

“You are to rate the film according to the
degree to which you like the parts of the film. If
you like a part very well, depress key No. [illegible]. If
you dislike a part of the film, depress key No. [illegible].
The keys that lie between should be pressed to
show just how much you like or dislike each
part. It will be possible for us to improve the
film by making the entire film similar to the
parts that you indicate you like. Rate the film
from beginning to end.”

During a second showing of the film to the
same groups, each group was asked to rate the
film a second time, using the same judgment that had
been used the first time.

During the third showing, the students in one
group were asked to rate how much they had
learned; in the second group they were asked to
rate how clear the film was; in the third group
they were asked to rate the excellence of the film
as a teaching aid.


The remaining four groups of each class were
handled as follows:

One group took form A of a true-false test
which exhaustively sampled the information contained
in the film (in both pictures and commentary).

The second group took form B of the same test
(true questions of form A were false questions in
form B; false questions in form A were true
questions in form B).

These two groups acted as a control on the
pre-film knowledge of the experimental population;
they did not see the film.

The two remaining groups received no pre-tests.
They were each shown the film, and then
were respectively given one of the two test forms
mentioned above.

The nine rating profiles (Groups I-III) were
each established by taking the average rating for
each group for every two feet of film, a total of
151 points for each profile.

The test questions were distributed approximately
evenly to provide a measure in each interval.
The four basic test profiles (for Groups
IV-VII) were prepared by calculating the mean
score for each point on the time axis. Some
questions were used at more than one point in
the film. These four basic profiles generated a
variety of profiles of learning gains (difference
between a post-test profile and an appropriate
pre-test profile).

The list of the profiles is as follows:

1. Footage: Number of feet of elapsed film,
a measure of the time base.

2. I am learning, 1st showing: Ratings of extent
of learning during first viewing of film
by Group I.

3. Like-Dislike, 1st showing: Ratings of degree
to which viewer liked film, during first
viewing of film by Group II.

4. Predicted learning, 1st showing: Ratings of
the extent to which one’s classmates would
learn the content, during the first viewing of
the film by Group III.

5. I am learning, 2nd showing: As for profile
2, during second viewing by Group I.

6. Like-Dislike, 2nd showing: As for profile 3,
during second viewing by Group II.

7. Predicted learning, 2nd showing: As for profile
4, during second viewing by Group III.

8. Rated end knowledge, 3rd showing: Ratings
of amount learned during previous showings,
during third viewing by Group I.



9. Clarity: Ratings of the extent to which the 
film presentation was clear and understand- 
able, during third viewing by Group II. 

10. Good-Bad film: Ratings of the excellence of 
the film as a teaching aid, during third view- 
ing by Group III. 

11. Test A learning: Mean score of Group IV 
minus mean score of Group VI (control or 
no-film group for Form A) for each profile 
point. 

12. Test B learning: Mean score of Group V 
minus mean score of Group VII (control or 
no-film group for Form B) for each profile 
point. 

13. True questions learning: Mean score on true 
questions from Groups IV and V minus mean 
score on true questions from Groups VI and 
VII. 

14. False questions learning: As for profile 13, 
but for false questions. 

15. Comparable questions learning I: Mean score 
for Group V minus mean score for Group VI, 
for comparable questions. 

16. Comparable questions learning II: As for 
profile 15, Group IV minus Group VII. 

17. Maximum learning: Profile obtained by as- 
suming maximum learning. 

18. Terminal knowledge, Form A: Group IV 
mean scores for profile points. 

19. Terminal knowledge, Form B: As for profile 
18, for Group V scores. 

20. Terminal knowledge, true questions: Group 
IV mean scores and Group V mean scores on 
true questions only. 

21. Terminal knowledge, false questions: As for 
profile 20, for false questions. 


Purpose of Present Analysis 


Dr. Twyford was primarily interested in 
developing profile construction methodology, 
and in comparing the relative efficiency, with 
measured learning as a criterion, of the af- 
fect rating profiles and the estimated teach- 
ing effectiveness profiles for detecting the 
strong and weak points of the film. Part of 
his analysis, therefore, involved correlating 
the profiles. This was done by taking, for 
each profile, a measurement at every two 
feet of film length (151 measurements in all) 
and intercorrelating the profile arrays.⁵
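
The two computations just described, forming a learning-gain profile and intercorrelating profile arrays point by point, can be illustrated with a short Python sketch. It is offered under stated assumptions rather than as Dr. Twyford's actual procedure: the helper names (`gain_profile`, `pearson`, `intercorrelate`) are hypothetical, the correlation is the ordinary product-moment coefficient, and the six-point profiles are invented numbers standing in for the 151-point arrays.

```python
from math import sqrt


def gain_profile(post, pre):
    """Point-by-point learning gain: film-group mean minus no-film-group mean."""
    return [a - b for a, b in zip(post, pre)]


def pearson(x, y):
    """Product-moment correlation between two equally long profile arrays."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def intercorrelate(profiles):
    """Correlate every pair of named profiles once."""
    names = list(profiles)
    return {(a, b): pearson(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Invented six-point profiles; the study used 151 points per profile.
film_group = [0.80, 0.70, 0.90, 0.60, 0.75, 0.85]
no_film_group = [0.50, 0.55, 0.50, 0.45, 0.50, 0.60]
profiles = {
    "test_learning": gain_profile(film_group, no_film_group),
    "i_am_learning": [3.1, 2.8, 3.6, 2.5, 3.0, 3.2],
    "like_dislike": [2.9, 3.0, 3.3, 2.7, 3.1, 3.0],
}

matrix = intercorrelate(profiles)
for pair, r in matrix.items():
    print(pair, round(r, 2))
```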


5 Before further analysis, Dr. Twyford applied a 
series of corrections to his correlations. These cor- 
rections were not employed in this analysis. These 
corrections for “drift,” “lag,” and “carry-over” in- 
volved either smoothing lines or partialling out time. 
They were not applied in this factor analysis; first, 
because they had little effect on the observed cor- 
relations; second, because the partial correlations 
would be taken care of in the factor matrix; and 


Table 1


Intercorrelation Matrix for Twenty Profiles and Film Footage


                                           1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21
1. Footage                                --  33  41  49  30  29  50  06  02  43 -05 -11 -07 -03 -04 -01 -13 -23 -21 -22 -23
2. I am Learning, 1st Showing             33  --  65  89  90  48  91  37  55  89  33  20  27  25  22  30  12  02  09  05  04
3. Like-Dislike, 1st Showing              41  65  --  76  59  67  71  58  62  73  03  02 -01  05  04  07 -02 -06 -02 -01 -09
4. Predicted Learning, 1st Showing        49  89  76  --  76  55  9?  39  53  89  17  07  14  12  10  17 -01 -11 -06 -07 -12
5. I am Learning, 2nd Showing             30  90  59  76  --  47  86  36  52  85  35  24  25  32  27  35  21  07  14  10  11
6. Like-Dislike, 2nd Showing              29  48  67  55  47  --  58  55  50  58  11  03  07  13  09  16 -01 -02  02  02 -07
7. Predicted Learning, 2nd Showing        50  91  71  9?  86  58  --  37  53  94  27  12  19  23  16  28  06 -07 -01 -02 -06
8. Rated End Knowledge, 3rd Showing       06  37  58  39  36  55  37  --  78  50  15  15  12  16  08  15  04  09  12  10  05
9. Clarity                                02  55  62  53  52  50  53  78  --  69  16  15  12  15  12  17  07  11  14  16  05
10. Good-Bad Film, 3rd Showing            43  89  73  89  85  58  94  50  69  --  25  18  20  24  20  29  08 -01  05  08  00
11. Test A Learning                      -05  33  03  17  35  11  27  15  16  25  --  74  82  90  71  78  71  60  59  55  65
12. Test B Learning                      -11  20  02  07  24  03  12  15  15  18  74  --  85  84  77  66  75  62  69  56  72
13. True Questions Learning              -07  27  01  14  25  07  19  12  12  20  82  85  --  72  70  67  68  54  55  49  59
14. False Questions Learning             -03  26  05  12  32  13  23  16  15  24  90  84  72  --  76  75  77  66  68  62  72
15. Comparable Questions Learning I      -04  22  04  10  27  09  16  08  12  20  71  77  70  76  --  72  75  59  52  50  61
16. Comparable Questions Learning II     -01  30  07  17  35  16  28  15  17  29  78  66  67  75  72  --  71  50  62  57  52
17. Maximum Learning                     -13  12 -02 -01  21 -01  06  04  07  08  71  75  68  77  75  71  --  84  85  80  84
18. Terminal Knowledge, Form A           -23  02 -06 -11  07 -02 -07  09  11 -01  60  62  54  66  59  50  84  --  85  92  88
19. Terminal Knowledge, Form B           -21  09 -02 -06  14  02 -01  12  14  05  59  69  55  68  52  62  85  85  --  87  86
20. Terminal Knowledge, True Questions   -22  05 -01 -07  10  02 -02  10  16  03  55  56  49  62  50  57  80  92  87  --  77
21. Terminal Knowledge, False Questions  -23  04 -09 -12  11 -07 -06  05  05  00  65  72  59  72  61  52  84  88  86  77  --




Table 2 
Factor Loadings Before Rotation (Fo) 

Variable I Il ul Iv v VI VII h? 

1 156 —.439 —.241 .140 =.334 153 094 438 

2 682 —.573 —.235 199 .205 —.058 —.077 939 

3 504 —.668  —.204 045 = —.188 14 029 793 

4 582 —.700 —.189 157 —.040 094 —.123 915 

x 094 —.493 —.192 .188 165 —.109 .082 843 

6 A506 sS „180 —.192 —.231 —.060 055 599 

$ 653 —.666 —.265 188 —.036 —.075 —.012 983 

8 463 —.397 426 4 134 —.044 .020 .762 

9 2538 —.470 376 —.258 .266 —.058 —.064 .796 

10 .696 —.659 —.119 094 -092 026 034 952 

11 756 419  —.272 = 163 033-168 = 119 891 

12 708 501 —.165 —.188 :133 .221 -100 .892 

13 680 429 —.274 —.200 137 148 —.173 $32 

14 “774 463 —.180 —.133 —.038 —.100 .099 885 

15 672 420 — 213 —.155 —.125 131 .050 TAL 

y 16 716 352 —.202 —.107 —.088 —.221 077 752 

17 697 608 .069 143 —.086 054 .079 897 

18 599 620 “360 "233. —.025 Oa =J ‘048 

19 638 576 308 .205 070 —.078 «136 905 

20 “300 557 377 241 —.079 —.147 —.140 918 

21 600 610 .206 182 135 .128 104 .890 

Table 3 
Rotated Factor Matrix (V = FA) 
r i Terminal Terminal 
ee ary Knowledge Knowledge 
Var Factor Factor Factor Factor (1) IV’ G 

iable ar) GIT’) a (V’) b (VII) av’) VI’) 
1 = 34 ms 006 -062 —.231 (223 
2 on) 5 136 —.095 —.080 175 —051 
106 : ; 77 
3 i 746 —.026 .068 -001 —.019 .177 
4 —.010 “935 "234 031 —.140 190 133 
5 -010 pE “4 —.037 078 —.158 —.103 
a 175 ‘678 — 111 392 .036 51 —.019 
7 -057 “933 268 —.013 —.033 —.214 —.031 
8 .085 685 —.022 423 029 469 —.064 
S .027 “788 “199 340 —.047 313 —.094 
io .020 a 314 059 .026 —.090 ` 044 
i 084 “055 088 i022 —.121 -259 —.164 
12 .787 ‘ao "133 .005 -111 -309 .197 
3 780 oe 101 —.095 —.166 305 136 
14 -709 “ia .087 .094 -097 -231 —.098 
is 850 ae _018 079 043 224 146 
16 773 E 054 .083 .069 -180 —.206 
17 a 063 273 “300 .090 —.024 029 
18 -928 L057 391 484 —.108 —.095 —.012 
19 Be — "003 433 ‘406 163 —.057 — 138 
20 838 ae 367 524 119 116 192 
839 —.010 7 .280 134 —.028 063 
21 830 —.091 -42 g 


Table 4
Cumulative Transformation Matrix (A)

Variable     I'      II'     III'    IV'     V'      VI'     VII'

I           .001    .735    .379    .089    .303   -.003    .290
II          .018    .625   -.777    .102    .000    .039    .000
III         .034    .022    .223    .056    .162   -.072    .828
IV          .012    .096    .000   -.978    .718   -.028    .000
V           .058   -.244    .102    .150    .005   -.126   -.480
VI          .011    .000    .000    .000    .000    .988    .000
VII         .997    .000    .000    .000    .000    .000    .000

Table 5
Cosine Matrix (C = AA')

Variable     I'      II'     III'    IV'     V'      VI'     VII'

I'         1.000    .000    .000    .000    .050    .000    .000
II'         .000   1.000   -.080    .003    .148    .000    .348
III'        .000   -.080   1.000    .000    .273    .000    .300
IV'         .000    .003    .000   1.000   -.576    .000    .000
V'          .050    .148    .273   -.576   1.000   -.108   -.068
VI'         .000    .000    .000    .000   -.108   1.000    .000
VII'        .000    .348    .304    .000   -.068    .000   1.000


It is the purpose of the present study to 
analyze the resulting correlation matrix, to 
determine whether the domain as tapped by 
the twenty profiles can be readily reduced to 
a fewer number of dimensions. It is obvious 
that the measured learning profiles at least 
must intercorrelate fairly highly, since they 
were all based on the same instruments. It 
was the initial hypothesis of this analysis, 
therefore, that the whole matrix could be re- 
duced to three factors: a measurement fac- 
tor, with loadings on the tests; a learning 
factor, with loadings on both the tests and 
the estimated teaching effectiveness ratings; 
and an affective rating factor.


Results 


Table 1 presents the intercorrelation matrix.
A centroid analysis was applied, yielding the
(unrotated) solution given in Table 2. Rotation
yielded the final factor matrix in Table
3. The transformation and cosine matrices
are given in Tables 4 and 5.
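
Two small checks on these matrices can be written out in Python. The sketch below is illustrative and rests on assumptions that the tables themselves do not state: the decimal points missing from the printed figures are restored (for example "682" is read as .682), the second column of Table 4 is taken to be the axis labelled II', and the function names `communality` and `project` are hypothetical. The first function sums the squared unrotated loadings of one profile, which should approximate its printed h²; the second forms one entry of the rotated matrix as the product of a row of unrotated loadings with a column of the transformation matrix.

```python
# Unrotated loadings of profile 2 ("I am learning", 1st showing) from Table 2,
# with decimal points restored by assumption.
loadings_var2 = [0.682, -0.573, -0.235, 0.199, 0.205, -0.058, -0.077]
printed_h2 = 0.939  # communality printed for profile 2 in Table 2

# Second column of the transformation matrix in Table 4, assumed to be II'.
axis_II = [0.735, 0.625, 0.022, 0.096, -0.244, 0.000, 0.000]


def communality(loadings):
    """Sum of squared factor loadings for one variable."""
    return sum(a * a for a in loadings)


def project(loadings, axis):
    """Loading of one variable on a rotated axis: row of F times column of A."""
    return sum(a * b for a, b in zip(loadings, axis))


h2 = communality(loadings_var2)
v = project(loadings_var2, axis_II)

print(round(h2, 3), printed_h2)
print(round(v, 3))
```

The recomputed communality agrees with the printed value to within rounding of the three-place loadings, which is a useful sanity check before any further rotation is attempted.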


third, because the corrections were applied not to
the entire correlation matrix, but to different sectors
of it.

Very early in the rotations, two factors
practically orthogonal to each other with large
loadings emerged. One included all the test
profiles (II'), the other included all the rating
profiles (III'). For neither of these, nor
for any other factor, was the projection of
film footage (a measure of the time base)
significant, indicating that these were not
simply functions of sequence.

Repeated rotational efforts failed to lead to
the identification of any other significant factors.
There is a suggestion of overlap between
ratings of learning and measured learning
in factors V' and VII', but the loadings are
too small to warrant any large degree of
confidence.

Summary 


This analysis suggests that test-measured
learning gains from a film are more or less
independent of ratings of learning or ratings
of affective qualities of the film. Furthermore,
the results indicate that all the rating
scales used have a large common-factor variance.
Whether viewers rate on an “I am
learning” scale, a “Like-Dislike” scale, a
“Good-Bad Film” scale, or any similar scale,
much the same function seems to be rated.
The findings of this analysis do not agree
completely with the conclusion of the original
study that a combined test measure is predicted
to a significant extent by ratings on
the scale “I am learning.” Examination of
the correlation matrix before and after corrections
for “lag,” “drift,” and “carry-over”
(see footnote 5) shows that correlations between
this rating scale and the test score profile
were uniformly stepped up as a result of
the corrections. It is possible that inclusion
of these corrections would have led more definitely
to the establishment of a common
learning factor, with saturations both on the
tests and on the scales rating learning functions.


Received October 27, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


The Relation of Light Intensity to Accuracy of Depth Perception 


A. S. Edwards 
The University of Georgia 


The following experiments were performed 
to discover the relation of light intensity to 
the accuracy of visual depth perception. 


Procedure 


In the first three experiments two lights 
were used, one of 50 fc, the other of 12 fc. 
In the fourth experiment, three intensities of 
light were used, 12 fc, 50 fc, and 100 fc. 
With all lights, shadows were eliminated. 

Apparatus. The apparatus was a modified 
Howard-Dolman depth perception apparatus. 
Instead of the upright poles, two blocks of 
wood were substituted. These blocks were 
made to hold rectangles of heavy pasteboard 
on which were pasted gray or colored paper. 
The size of the stimulus thus presented was 
1¼" by 2¼". The headrest was at 20 feet
distance and kept the S from seeing the tops 
of the blocks. A standard gray (Stoelting 
gray No. 19) was always used. The movable 
stimulus had one of the Stoelting grays or 
colors as follows: gray No. 19, blue No. 13, 
green No. 8, yellow No. 4, or red No. 1. In- 
tensities of lights were checked by means of 
a new light meter from the Physics Depart- 
ment and also by means of one from the 
local electric power company. New papers 
were purchased for the experiments. 

Subjects. The Ss were students chosen at 
random from classes in the University. Only 
those with at least 20/20 vision, or vision cor- 
rected to 20/20, were used. The Ortho-Rater 1 
was used to check vision. Color blind Ss were 
eliminated. 

Instructions. Ss were told, “The object of
the experiment is to see how accurately you
can judge distance by aligning a movable object
with a stationary one. You are to take
these strings and pull the right hand object
so that it is exactly even with the stationary
gray. The right hand string pulls the movable
object forward and the left hand string
pulls it backwards. When you think the
movable stimulus is exactly even with the
stationary one, drop the strings. You may
move the right hand object back and forth
as much as you want to. Keep your chin on
the chin rest.”


1 Bausch and Lomb Optical Company. Standard
practice in the administration of the Bausch and
Lomb occupational vision test with the Ortho-Rater.
1944, pp. 1-5.


Experiments 1, 2, and 3. The first series 
was run with 40 Ss with 50-fc light, and with 
40 Ss with 12-fc light. In the second series 
there were 30 Ss who were used as both control
and experimental Ss with the 12-fc light
first. The third series had 30 Ss used as both 
control and experimental Ss, but with the 50- 
fc light used first. 

Experiment 4. This experiment was car- 
ried out under the direction of the author by 
Mr. E. L. Franklin, a graduate student.
There were 50 Ss and a 100-fc light was used
in addition to the 12-fc and 50-fc lights. The 
order of lights was alternated. 


Results 


On the average it appears that depth perception
is more accurate with the greater
amount of illumination. But in the first
three experiments, it is only in Experiment 1
that this conclusion is substantiated by a
critical ratio that indicated significance at
the one per cent level. See Table 1.

In Experiment 4, the averages are better as
the light intensity increases. But only the
difference between the 12-fc and the 100-fc
lights gave a critical ratio that was significant
at the one per cent level. See Table 2.


Table 1

Accuracy of Visual Depth Perception with Two Intensities of Illumination

Note: The standard (stationary) stimulus object was at 50 cm. on the scale. M shows the mean on the scale.

                 50 fc              12 fc
Exp.   Ss    M cm.   S.D.      M cm.   S.D.     C.R.
1      80    52.47   3.72      54.72   3.1      3.1 (1%)
2      30    53.07   3.76      54.25   2.7      1.42
3      30    55.71   4.35      56.85   3.45     1.09


Table 2

Accuracy of Visual Depth Perception with Three Intensities of Illumination

Note: The standard stimulus was at 50 cm. and the M gives the average on this scale.

                 50 fc                 12 fc                 100 fc
Exp.   Ss    M cm.      S.D.       M cm.      S.D.       M cm.      S.D.
4      50    (1) 56.6   1.92       (2) 56.9   1.4        (3) 55.5   2.88

C.R.: 1-2 = 0.93; 1-3 = 2.29 (5%); 2-3 = 3.18 (1%).


Individual Results. It appears that some
Ss are more accurate with less rather than
with more light. In the first three experiments,
the per cent of Ss who did better with
12-fc than with 50-fc light were as follows:
(a) Experiment 1, 25 per cent; (b) Experiment
2, 30 per cent; and (c) Experiment 3,
[?]0 per cent.
In Experiment 4, which used three intensities
of light, the per cent of judgments that
were better with the light of less intensity
are as follows: (a) 20 per cent better with
12 fc than with 100 fc; (b) 20 per cent better
with 50 fc than with 100 fc; and (c) 22 per
cent better with 12 fc than with 50 fc.

In our four experiments, it thus appears
that a considerable number of Ss judge depth
better with less rather than with greater
intensities of light.
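
The critical ratios reported for the averages can be approximated with a short Python sketch. It assumes the conventional critical ratio for two independent means (the difference divided by the square root of the summed squared standard errors); the paper does not state which formula was used, and since the same Ss served under every light in Experiment 4 a correlated-means formula may have been applied. The function name `critical_ratio` is hypothetical, and the inputs are the Experiment 4 means and S.D.s as printed in Table 2.

```python
from math import sqrt


def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two means divided by its standard error,
    treating the two sets of scores as independent (an assumption)."""
    return abs(m1 - m2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Experiment 4 figures as printed in Table 2 (50 Ss per light).
cr_12_vs_100 = critical_ratio(56.9, 1.4, 50, 55.5, 2.88, 50)
cr_50_vs_12 = critical_ratio(56.6, 1.92, 50, 56.9, 1.4, 50)

print(round(cr_12_vs_100, 2), round(cr_50_vs_12, 2))
```

Under these assumptions the two ratios come out close to, though not identical with, the printed 3.18 and 0.93, so the sketch should be read as an approximation of the reported test rather than a reproduction of it.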


Summary 


1. In terms of averages alone, it appears 
that Ss were more accurate in depth percep- 
tion with greater illumination rather than 
with less. 

2. Analyses of our data, however, show 
that from one-fifth to one-fourth of our Ss 
were more accurate with less rather than with
more intense illumination.
3. It appeared that no one intensity of 
illumination was optimal for all Ss. 


Received November 3, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 4, 1953 


Instrument Reading III: Check Reading of Instrument Groups 


W. J. White, M. J. Warrick, and W. F. Grether 


Psychology Branch, Aero Medical Laboratory, Wright Air Development Center, 
Wright-Patterson Air Force Base, Ohio 


Most previous studies (1, 3) of instrument 
reading have been concerned solely with pre- 
cise quantitative readings. However, it has 
been pointed out by Grether (2) that there 
are many situations where instruments are 
read only to obtain assurance of a null, nor- 
mal or desired indication (check reading), or 
for an indication of the direction in which an 
instrument pointer deviates from a desired 
value or position (qualitative reading). Mil- 
ton, Jones, and Fitts (4) report that the aver- 
age duration of the visual fixations of a pilot 
flying on instruments is approximately 0.5 
seconds. This suggests that aircraft instru- 
ments are typically read in a check reading 
manner. 

This paper summarizes a series of experi- 
ments concerning the effect of variation in 
pointer alignment position, in dial diameter 
and in pointer design on check and qualita- 
tive reading of instrument groups. For more 
detailed reports of these experiments, the 
reader is referred to a USAF Memorandum 
Report by Warrick and Grether (6) and two 
USAF Technical Reports by White (7, 8). 


Experiment I: Effect of Pointer Alignment 
Position on Check and Qualitative 
Reading 


A series of studies was carried out to de- 
termine the effect of various pointer-align- 
ment positions on the speed and accuracy of 
check and qualitative reading of a four by 
four arrangement of 16 instruments as shown 
in the center panel of Figure 1. In addition 
to alignment at the 9, 12, 3 and 6 o'clock po- 
sitions, several mixed alignment positions were 
studied. 

Apparatus and Procedure. The simulated panel
of 16 instruments was presented by
projected lantern slides. By the operation of a
shutter device the subject controlled the exposure
of the projected instruments. For the check
reading tests, the subject responded by operating
a three-position toggle switch held in his lap.
He was instructed to move the switch to the
right if all pointers were aligned, and to the left
if any pointers were misaligned. Forty college
students were used as subjects in this experiment.

A somewhat more complicated response was
required in the measurement of qualitative reading
performance. To the right of the subject
was a group of 16 three-position toggle switches
arranged in the same manner as the 16 simulated
instruments. If there was a deviating pointer
the subject moved the corresponding switch in a
direction to indicate an increase or decrease from
the alignment position. Forty aviation cadets
and forty experienced instrument pilots were
used in the experiment. Each of the eighty subjects
served under only one condition of pointer
alignment.

The apparatus and procedure used in studying
mixed alignment differed from that described
above. The stimulus material was an actual
mock-up of 16 pointers and dials. Only the 9,
12 and 3 o'clock alignment positions were studied.
All pointers in any one row deviated by the same
amount from the alignment position, the amount
of deviation varying between rows. The exposure
shutter was operated by the experimenter,
who also set the pointers from the rear of the
panel. Twelve subjects, staff members of the
Aero Medical Laboratory, were used in this
experiment, each subject serving under all
experimental conditions.


Results 
The results for the check reading study, Table 1, show that the time taken for a check reading appears to be relatively independent of the orientation of the pointer. The average times required to scan the panel and operate the lap-held switch were 0.66, 0.69, 0.68 and 0.63 seconds for the pointers in the 9, 12, 3 and 6 o'clock positions respectively. Error frequency likewise did not discriminate between the 9, 12, 3 and 6 o'clock alignment positions. With mixed alignment the response times were more than doubled and the frequency of errors was generally higher, as shown in Table 2.


Table 1

Speed and Accuracy of Check Reading a Group of 16 Instruments with Homogeneous Pointer Alignment

Note: N = 40 (10 subjects in each group).

Pointer        Mean Response        Per Cent of
Position       Time in Seconds      Responses in Error
9 o'clock           0.66                  2.0
12 o'clock          0.69                  2.5
3 o'clock           0.68                  5.0
6 o'clock           0.63                  3.0


Table 2

Speed and Accuracy of Check Reading 16 Instruments with Pointers Aligned Diagonally Around the 9, 12 and 3 o'clock Positions

Pointer        Mean Response        Per Cent of
Position       Time in Seconds      Responses in Error
9 o'clock           1.64                 18.8
12 o'clock          1.56                 15.6
3 o'clock           1.59                 10.6
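The statement that mixed alignment more than doubled the response times can be checked arithmetically from the two tables. The following is a minimal sketch, assuming the mean response times as read from Tables 1 and 2; the variable names are illustrative, and the comparison is only approximate because the two conditions used different subjects and apparatus.

```python
from statistics import mean

# Mean response times in seconds, as read from Tables 1 and 2.
uniform = {"9 o'clock": 0.66, "12 o'clock": 0.69, "3 o'clock": 0.68, "6 o'clock": 0.63}
mixed = {"9 o'clock": 1.64, "12 o'clock": 1.56, "3 o'clock": 1.59}

# Ratio of mixed to uniform response time for each position present in both tables.
ratios = {pos: mixed[pos] / uniform[pos] for pos in mixed}

# Ratio of the overall means (four uniform positions, three mixed positions).
overall_ratio = mean(mixed.values()) / mean(uniform.values())

for pos, ratio in ratios.items():
    print(f"{pos}: mixed/uniform = {ratio:.2f}")
print(f"overall: {overall_ratio:.2f}")
```

Every ratio exceeds 2, which agrees with the wording of the text.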

For qualitative reading, the exposure and response time data in Table 3 show that alignment at the 9 o'clock position resulted in more rapid reading of the group of 16 instruments than did alignment at the 12, 3 or 6 o'clock positions. A comparison of the error data shows that the readings at the 9 o'clock alignment position were also more accurate. The results are similar for both pilots and cadets. Mixed alignment conditions caused a considerable increase in response time and errors, as shown in Table 4.


Discussion 
The apparent superiority of pointer alignment at the 9 o'clock position over the other cardinal positions for qualitative reading may have resulted from the relationship between the direction of the required switch response and the direction of the pointer deviation. The instructions to the subject read in part, ". . . if the reading is too much, flip the switch down, . . . if the reading is too little, flip the switch up. . . ." The direction of the deviation which corresponds to a judgment of "too little" or "too much" changes with alignment position. Apparently the relationship inherent in the 9 o'clock position was optimal under these instructions.

The results of the experiment on check 
reading did not indicate the superiority of any 
one of the four alignment positions. They 
did demonstrate, however, that pointer align- 
ment, particularly uniform pointer alignment, 
greatly simplifies the task of reading a group 
of instruments. The response times found in 
this experiment using 16 instruments are ap- 
proximately equal to the time required for 
reading a single instrument in flight. 


Experiment II: Effect of Pointer Design 
on Check Reading 


The purpose of this experiment was to de- 
termine the effect of certain factors in the de- 
sign of pointers on the speed and accuracy of 
check reading a group of 16 instruments with 
horizontal pointer alignment. The superiority 
of pointer alignment for rapid check reading 
apparently results from the simple pattern 
which is disturbed by any deviating pointer. 
Any malfunction which would cause the pointer 
to rotate 180 degrees might not be detected 
because of the minor resultant change in the 
general alignment pattern. Variations in the design of the base of the experimental pointers in the present study were intended to increase the change in pattern resulting from a 180-degree deviation.


Table 3

Speed and Accuracy of Qualitative Reading of Sixteen Instruments with Homogeneous Pointer Alignment

Note: N = 48 (12 subjects in each group).

               Mean Exposure Time     Mean Response Time     Per Cent of Responses
Pointer        in Seconds             in Seconds             in Error
Position       Pilots     Cadets      Pilots       Cadets    Pilots     Cadets
9 o'clock      0.90       1.17        [illegible]  1.84       2.91       3.00
12 o'clock     1.15       1.39        2.02         2.17       6.25      12.08
3 o'clock      1.30       1.83        2.32         2.34      23.33      20.83
6 o'clock      1.62       1.46        1.89         2.29      10.83       9.58


Table 4

Speed and Accuracy of Qualitative Reading 16 Instruments with Pointers Aligned Diagonally Around the 9, 12 and 3 o'clock Positions

Note: N = 12.

Pointer        Mean Response        Per Cent of
Position       Time in Seconds      Responses in Error
9 o'clock           4.51                 15.8
12 o'clock          4.36                 22.6
3 o'clock           4.78                 14.9


Apparatus and Procedure. The stimulus material and method of presentation and procedures were very similar to those used in the preceding experiments in this report. The five pointer designs studied in this experiment are shown in Figure 1. All pointers had about the same physical dimensions. They were 1¾" long and [width illegible] wide at the center. Five film strips, one for each pointer design, were made of a panel containing 16 simulated 1¾" instruments.

Forty instrument pilots and an equal number of cadets served as subjects; each served under only one condition of pointer design.

Fig. 1. Instrument panel arrangement used in all check reading experiments and pointer designs used in Experiment II.


Results 


The results in terms of speed and accuracy of check reading as a function of pointer design are summarized in Table 5. In the first two columns of this table are the per cent of readings in error, all errors being combined, for pilots and cadets. In the next two columns are the per cent of undetected 180-degree deviations. For both subject groups the number of undetected deviations was greatest for design D. For the pilot group only the largest difference, that between designs D and E, is significant at the 5% level of confidence. For the cadet group none of the differences are significant at the 5% level of confidence. This suggests that the experimental pointers (designs A, B, C and E), when rotated 180 degrees from alignment position, were little if any more effective in providing cues for detection of misalignment than the standard pointer (design D).


Table 5

The Effect of Pointer Design on the Speed and Accuracy of Check Reading 16 Instruments with Homogeneous Pointer Alignment

Note: N = 90 (10 subjects in each sub-group).

          Per Cent of Readings   Per Cent of Undetected   Per Cent Misc.       Mean Response
          in Error               180° Deviations          Errors               Time in Seconds
Pointer   (N = 200 Trials)       (N = 40 Trials)          (N = 160 Trials)
Design    Pilots     Cadets      Pilots     Cadets        Pilots     Cadets    Pilots    Cadets
A         14.00      11.00       45.00      37.50          6.25       4.37     2.0       2.11
B         16.50      10.50       47.50      35.00          8.75       4.37     1.62      1.91
C         15.50      15.5        47.50      27.50          7.50      12.50     1.80      2.05
D         14.50      12.5        62.50      47.50          2.50       4.37     1.75      1.90
E         21.50      --          39.00      --            16.00      --        2.80      --


If we compare the data for miscellaneous errors in Table 5, a somewhat different picture is obtained. For the pilot group the error percentage was lower for the standard pointer, design D, than for other designs, with the differences significant at the 5% or greater level of confidence. The results for the cadet group do not support this superiority of design D. The mean response time values in the last two columns of Table 5 show no significant differences between pointer designs.
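The three error columns of Table 5 are related: the 200 trials per group are made up of 40 trials with a 180-degree deviation and 160 other trials, so the combined error percentage should equal the trial-weighted average of the other two columns. The sketch below works through that arithmetic. It assumes the percentages as read from the table (several cells are damaged in the source) and assumes the stated trial counts apply to every design; the function and variable names are illustrative only.

```python
TRIALS_180 = 40    # trials containing a 180-degree deviation
TRIALS_MISC = 160  # all other trials

# (per cent undetected 180-degree deviations, per cent misc. errors,
#  per cent of all readings in error), as read from Table 5.
table5 = {
    ("A", "pilots"): (45.00, 6.25, 14.00),
    ("A", "cadets"): (37.50, 4.37, 11.00),
    ("B", "pilots"): (47.50, 8.75, 16.50),
    ("B", "cadets"): (35.00, 4.37, 10.50),
    ("C", "pilots"): (47.50, 7.50, 15.50),
    ("C", "cadets"): (27.50, 12.50, 15.50),
    ("D", "pilots"): (62.50, 2.50, 14.50),
    ("D", "cadets"): (47.50, 4.37, 12.50),
    ("E", "pilots"): (39.00, 16.00, 21.50),
}


def combined_error(pct_180, pct_misc):
    """Trial-weighted average of the two error percentages."""
    total = TRIALS_180 + TRIALS_MISC
    return (pct_180 * TRIALS_180 + pct_misc * TRIALS_MISC) / total


# Difference between the recomputed and the reported combined percentage.
diffs = {
    key: combined_error(p180, pmisc) - reported
    for key, (p180, pmisc, reported) in table5.items()
}

for (design, group), diff in diffs.items():
    print(f"design {design}, {group}: recomputed - reported = {diff:+.2f}")
```

Under these assumptions designs A to C reproduce the reported totals to within rounding, while design D (cadets) and design E (pilots) differ by roughly 0.5 and 0.9 percentage points. Such differences may reflect damaged digits in the source or different trial counts rather than an error in the original table.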


Discussion 
The failure of the experimental pointers to reduce substantially the 180-degree type error may, perhaps, be attributed to a configurational change brought about when the pointer overlays the dial numerals and graduations. This reduces the visibility of the expansions of the pointer base. This is especially likely when the panel is scanned rapidly as in check reading.


Experiment III: Ocular Movements in 
Relation to Dial Diameter 


The pioneering work of Paterson and Tinker (5) on typographical factors which influence reading suggests that there might be an optimal dial diameter for instrument reading purposes. The optimal dial diameter should be such that the resulting instrument panel size could be scanned accurately, rapidly and with a minimum number of eye fixations. The most suitable approach to this problem appeared to be through the combined measurements of manual responses and of eye movements while check reading a panel of instruments.


Apparatus and Procedure. For this experiment three panels were used. The instrument diameters were 1, 1¾, and 2¾ inches. In all cases the spaces between the margins of the instruments were [fraction illegible] inch. These panels were check read using both eyes. The movements of the right eye were recorded by the corneal reflection technique. An American Optical Company Ophthalmograph was modified so that exposures could be made with either moving or stationary film. The moving film exposures provided data on the number of fixations, and the lateral position and duration of fixations. The stationary film exposures provided additional data on the position of fixation in both the vertical and lateral dimensions, but did not record the duration of fixations.


Table 6

Eye Movements, Response Times and Errors in Check Reading a Group of 16 Instruments

Note: N = 6.

                                                        Dial Diameter
Type of Measure                                         1 inch    1¾ inches      2¾ inches
Mean number of fixations per trial                      2.8       2.6            2.9
Mean time per fixation                                  0.29      0.26           0.26
Total time from preliminary to final fixation           0.82      0.72           0.75
Per cent deviating pointer trials on which
  deviating pointer was fixated                         72        [illegible]    81
Response time (sec.)                                    0.76      0.72           0.73
Per cent errors                                         6         4              12


Fig. 2. Distribution of first, second and third eye fixations in check reading a panel of 16 instruments. (Panels: preliminary fixation to the left of the test material; preliminary fixation to the right of the test material.)

On each of 60 test trials the subject was required to fixate a spot to the left of a screen covering the panel. When the screen was raised the subject scanned the panel as needed and then fixated on a spot to the right of the screen. He then signalled his judgment as to whether or not a pointer was deviated from the aligned position. For three of the six subjects the initial and final fixation points were reversed. When the film record of the subject's eye movements was projected against a replica of the corresponding instrument panel, the initial and final fixation points provided references for the analysis of the film records.


Results 


The major results obtained from the recording of eye movements are summarized in Table 6, which gives check reading time, response time and error data in addition to the various measures of visual performance. Although not statistically significant, the results are quite consistent in showing slightly superior performance with the 1¾-inch dial size.
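The first three rows of Table 6 can be related to one another: the number of fixations multiplied by the mean time per fixation gives the time spent fixating, which should account for most of the total time between the preliminary and final fixations. The sketch below assumes the values as read from Table 6; the assignment of the fixation counts to columns is itself an assumption, because that row is damaged in the source. The remainder is only a rough indication of time spent in eye movements, since all inputs are rounded.

```python
# Values as read from Table 6, in column order: 1, 1 3/4 and 2 3/4 inch dials.
dial_inches = (1.0, 1.75, 2.75)
fixations_per_trial = (2.8, 2.6, 2.9)
seconds_per_fixation = (0.29, 0.26, 0.26)
total_seconds = (0.82, 0.72, 0.75)

# Time spent fixating = number of fixations x mean duration of a fixation.
fixation_seconds = [n * t for n, t in zip(fixations_per_trial, seconds_per_fixation)]

# Whatever is left over is not accounted for by fixations (movement time, rounding).
remainder = [total - fix for total, fix in zip(total_seconds, fixation_seconds)]

for dial, fix, rest in zip(dial_inches, fixation_seconds, remainder):
    print(f"{dial} in.: fixating {fix:.3f} s, unaccounted {rest:+.3f} s")
```

With these readings the products fall within about 0.05 second of the reported totals for all three dial sizes.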

Analysis of the location of the 1st, 2nd and 3rd fixations, in relation to the panel arrangement, is shown in Figure 2. The major proportion of the first fixations were in the upper left quadrant of the instrument panel. This was true even when the preparatory fixation was at the right of the panel. Surprisingly few fixations were made on the lower two rows of dials, and those that did fall in this area were usually the last fixation in the scanning sequence. Analysis of the errors committed shows that 22 per cent more errors were made when the deviating pointer was in the lower half of the panel. In the fourth row of Table 6 is shown the per cent of deviating pointer trials in which the deviating instrument was fixated.


Conclusions 


The following conclusions are suggested by the results of the foregoing experiments:

1. The pointer alignment position (9, 12, 3 or 6 o'clock) has little effect on simple check reading involving the mere detection of pointer deviation.

2. Uniform horizontal or vertical pointer alignment facilitates instrument check reading.

3. Pointer alignment at the 9 o'clock position is optimum for qualitative reading as defined by the perceptual judgments and manual responses required in these experiments.

4. Pointer modifications of the type employed in the present experiment do not accomplish a satisfactory reduction in difficulty of detecting 180-degree deviation errors.

5. A 1¾-inch instrument dial appears superior to either a larger or smaller dial in terms of the number and duration of eye fixations while check reading.


Received August 25, 1952. 


References 


1. Fitts, P. M. Engineering psychology and equipment design. In S. S. Stevens (Ed.), Handbook of experimental psychology. New York: John Wiley & Sons, 1951, 1287-1340.

2. Grether, W. F. Instrument reading. I. The de- 
sign of long-scale indicators for speed and ac- 
curacy of quantitative readings. J. appl. Psy- 
chol., 1949, 33, 363-372. 

3. Kappauf, W. E. Studies pertaining to the design 
of visual displays for aircraft instruments, 
computers, maps, charts, tables and graphs: 
A review of the literature. Engineering Divi- 
sion, Air Materiel Command, Dayton, Ohio, 
USAF Technical Report No. 5765, April, 1949. 

4. Milton, J. L., Jones, R. E. and Fitts, P. M. Eye fixations of aircraft pilots, II. Frequency, duration and sequence of fixations when flying the USAF instrument low approach system. Engineering Division, Air Materiel Command, Dayton, Ohio, USAF Technical Report No. 5839, October, 1949.

5. Paterson, D. G. and Tinker, M. A. How to make type readable. New York: Harper & Bros., 1940 (out of print).

6. Warrick, M. J. and Grether, W. F. Effect of pointer alignment on check reading of engine instrument panels. Engineering Division, Air Materiel Command, Dayton, Ohio, USAF Memorandum Report No. MCREXD-694-17, June, 1948.

7. White, W. J. Effect of dial diameter on ocular 
movements and speed and accuracy of check 
reading groups of simulated engine instru- 
ments. Engineering Division, Air Materiel 
Command, Dayton, Ohio, USAF Technical 
Report No. 5826, June, 1949. 

8. White, W. J. Effect of pointer design and pointer 
alignment on speed and accuracy of reading 
groups of simulated engine instruments. En- 
gineering Division, Air Materiel Command, 
Dayton, Ohio, USAF Technical Report No. 
6014, July, 1950. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 4, 1953 


Dimensional Analysis of Motion: VI. The Component 
Movements of Assembly Motions * 


Robert Smader and Karl U. Smith 


The University of Wisconsin 


The application of electronic methods to in- 
vestigation of human motions (1, 4, 5) pro- 
vides the means of systematic experimental 
study of long-standing problems of motor co- 
ordination. Such electronic techniques, which 
have been used previously in studying panel- 
control operations and closely related per- 
formances, are used in this study to analyze 
the component movements in assembly mo- 
tions. Data are presented bearing upon the 
relative efficiency of the component move- 
ments of assembly motions and upon the ef- 
fects of practice on such movements. 

Two methodological principles are demanded 
for the scientific investigation of complex hu- 
man motions. First, the basic dimensions of 
motion must be subject to control and quanti- 
tative specification. Second, the component 
movements of the pattern must be segregated 
and measured separately. This second re- 
quirement for motion study must be achieved 
with an economy sufficient to permit replica- 
tion of the observation of the same motion 
pattern throughout the course of an experiment.

Control and specification of the various di- 
mensions of movement, especially the bodily 
and space dimensions of movement, are se- 
cured through preplanning of performance 
situations in order to permit quantitative 
variation in the direction, plane, magnitude,
bodily orientation, and other aspects of mo- 
tion. Measurement of the component move- 
ments in assembly motions may be achieved 
by means of electronic motion analysis tech- 
niques, to be described, which provide, for the 
first time, methods adequate for the experi- 


mental analysis of the separate movements of 
skilled performances. 


1 This research has been supported by funds voted by the Legislature of the State of Wisconsin, and assigned by the Graduate School Research Committee of the University of Wisconsin.


Methods 


Figure 1A illustrates the main parts of the apparatus used for electronic registration of the duration of the component movements in assembly motions. The task performed by the subject in this situation, as illustrated by Figure 1B, consists of four major movement components: (a) an initial reaching or travel movement that carries the hand and arm to the bins containing pins used as assembly objects; (b) a grasping movement in securing one of the pins; (c) a return or loaded travel movement that carries the pin to the assembly plate; and (d) a positioning motion of inserting the pin in the assembly plate.

The preplanned performance situation used here is designed to provide for systematic variation in different space dimensions of component movements of the assembly task (Figure 1B). Assembly bins and the assembly plate are arranged generally on a work table measuring [illegible] by 20 in. The metallic assembly bins, which are fitted with metal bases, are 8 in. long, 4 in. wide and 4 in. high. The floor or base of each bin is curved to facilitate grasping of the pins located inside. Six such bins may be used for pins or collars of different sizes. The distance and arrangement of these bins with respect to the assembly plate may be changed to vary the magnitude, direction and plane of component travel and grasping movements.

The assembly plate (Figure 1B) is of aluminum, [illegible] in. thick, [illegible] in. long and [illegible] in. wide, drilled with 16 large apertures, [illegible] cm. in diameter. Fitted into these apertures are discs which are drilled with three holes of different sizes ([three diameters illegible]). The discs may be changed into three different positions on the assembly plate. Three sizes of pins, all approximately one inch long but of diameters to fit the three sizes of holes, are provided in three bins. Variations in size of object grasped, serial order of placement of the pins, and other variations in the dimensional aspects of the assembly placing movement are made by means of the different placements of the bins and positions of the discs in the assembly plate.

The subject's movements in this situation are recorded by means of an electronic motion analyzer that automatically segregates and measures separately in 0.01 seconds the duration of each of the four basic component movements of the task. Figure 1A illustrates how the clocks and circuits of the analyzer are arranged with respect to the work area. The basic principle of operation of the electronic analyzer is that the subject acts as a key in the circuit and thus sequentially activates different relays and clocks² during the different stages of the assembly motion. The current level used in the subject's side of the circuit is below cutaneous threshold level, so that the operator has no sensory appreciation of the fact that he is serving as a conductor.

Fig. 1. A diagrams the analyzer circuits, the time clocks, the assembly plate, and the bins, which have curved bottoms in order to facilitate removal of the pins. B shows the surface detail of this same setup. The letters "L.H." and "R.H." refer to left hand and right hand respectively. The sequence of movement of the hand is indicated by number. The size of pins located in the bins is indicated next to the position of the bin. (Labels in the diagram: electric clocks; pin supply bins; assembly motion component analyzer; relay unit.)

2 Model S-1, Standard Electric Time Company, Springfield, Mass.

Fig. 2. [Distributions of the component movement times in seconds; legible panel titles include Position, Loaded Travel and Nonloaded Travel.]

Figure 1B shows how the device performs in actual operation. Upon starting a cycle of assembly motion, the subject initially touches the assembly plate. This contact with the plate starts the first of the four precision time-clocks, the initial travel time-clock. Upon coming into contact with the pins in one of the assembly bins, this first clock stops and the second, the grasping time-clock, starts. When the contact between the pin and the bin is broken, the grasping time-clock ceases to run and the loaded-travel time-clock starts to run. This third clock stops when the pin is brought into contact with the assembly plate, and the fourth clock, the assembly positioning time-clock, starts to run. This fourth clock in turn stops running when electrical contact with the assembly plate is broken and a second cycle of motion is initiated. The different times registered on each clock may be read after one cycle of motion or they may be allowed to summate over several cycles and then be read.
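The switching logic described above amounts to a small state machine in which each contact event stops the clock that is running and starts the next one. The following is a minimal software sketch of that logic, not a description of the relay circuit itself. The event names, the timestamps, and the assumption that the travel clock of the next cycle starts at the moment contact with the plate is broken are all illustrative.

```python
# Each contact event stops one clock and starts the next (names are hypothetical).
TRANSITIONS = {
    "plate_touched": (None, "travel"),                # first cycle: start travel clock
    "pin_contacted": ("travel", "grasp"),             # hand reaches a pin in a bin
    "pin_left_bin": ("grasp", "loaded_travel"),       # pin-to-bin contact is broken
    "pin_touched_plate": ("loaded_travel", "position"),
    "plate_released": ("position", "travel"),         # assumed start of the next cycle
}


def run_clocks(events):
    """Accumulate the running time of each clock over a sequence of (event, seconds)."""
    totals = {"travel": 0.0, "grasp": 0.0, "loaded_travel": 0.0, "position": 0.0}
    running, started = None, None
    for name, t in events:
        stops, starts = TRANSITIONS[name]
        if stops != running:
            raise ValueError(f"event {name!r} is out of sequence")
        if running is not None:
            totals[running] += t - started
        running, started = starts, t
    # The clocks described in the text register hundredths of a second.
    return {clock: round(seconds, 2) for clock, seconds in totals.items()}


# Two hypothetical cycles; the clocks summate across cycles.
events = [
    ("plate_touched", 0.00), ("pin_contacted", 0.40), ("pin_left_bin", 0.95),
    ("pin_touched_plate", 1.30), ("plate_released", 2.10),
    ("pin_contacted", 2.52), ("pin_left_bin", 3.00),
    ("pin_touched_plate", 3.36), ("plate_released", 4.10),
]
totals = run_clocks(events)
print(totals)
```

Because every instant between the first and last events is assigned to exactly one clock, the four totals add up to the elapsed time of the two cycles.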

In the experiment to be described, 46 right-handed college students were used as subjects. These subjects were given standard instructions to fill the assembly plate with pins according to a defined sequence, which was kept constant for all subjects. One complete filling of the plate constituted a trial. Subjects performed two trials per day for each of three consecutive days. The scores analyzed were the mean durations of each component movement for each day.

In addition to the running of the test trials, just described, a calibration trial was run before the first test trial of the second and third days. In these calibration trials, the two manipulative time clocks and the two travel time clocks were electrically connected so that the discrepancy between two clocks in measuring the same elapsed time could be determined.

Results


The results of this experiment will be presented with respect to the following main points: (1) distribution of time values for different component movements in the assembly task; (2) the effects of practice on different component movements in the task, as well as on their interrelation; and (3) the reliability of measurements of different movement components.

Figure 2 shows the distribution of different times of the four component movements in the assembly task on the third day of practice. The data presented in the figure represent the durations of individual movements of each part of the assembly task. The [name illegible] movement component gives the shortest durations. The assembly positioning movement gives the longest durations. All four movement components show distributions that are slightly skewed positively.

Fig. 3. The change in the duration (A) and in the variability (B) of the four different component movements as a function of practice. Generally, the two travel movements show less change with practice than that observed in the grasping and positioning movements. (Curves: Grasp, Position, Loaded Travel, Unloaded Travel; horizontal axis: Day, 1 to 3; vertical axis of B: Standard Deviation of Time/Trial, in seconds.)


Figures 3A and 3B describe the change in 
the duration of the component movements of 
the assembly task as a function of practice. 
Figure 3A gives the learning curves based on 
the mean duration of each of the component 
movements on each day. Figure 3B shows 
the change in variability of each component 
movement as a function of practice. 

Three of the four movement components in 
the task show a highly significant change in 
duration as a result of practice over three 
days. The greatest change, about 18 per 
cent, is found for the grasping component of 
the task. The other manipulative component 
of the task, the positioning movement, gives 
an 11 per cent change with practice. Both 
of these manipulations show a greater change 
with practice than does either of the travel 
movements. One of the latter, the unloaded 
travel movement, shows a statistically signifi- 
cant change with practice only at the 0.05 
point. Tests of significance of the differences between day 1 and day 3, based on "t," indicate that changes occurring in learning for
the three other components of movement are 
significant at the 0.001 point. On each day 
of practice, the differences between the mean 
durations of the two manipulative compo- 
nents, grasping and positioning, and between 
the two travel movements are statistically 
significant at the 0.001 level. 

The variability of different component 
movements does not change uniformly as a 
function of practice. The variability of the 
unloaded travel movement decreases with 
practice, but not significantly. In contrast, 
the variability of the grasping motion is 
reduced sharply with practice. The other 
manipulative movement, positioning, has its 


Robert Smader and Karl U. Smith 


standard deviation changed about 15 per cent 
with practice. The variability of the loaded 
travel movement is hardly changed at all 
during the three days. 

The present method of isolating electronically the different components of movement in assembly has provided the means of studying the interrelation between separate movements as a function of practice. In order to conduct such investigations, the individual mean scores for each component of motion are obtained for each day of practice and correlated. Thus, six correlations between pairs of the four component scores are obtained for each day of the study.

Fig. 4. The relation between different component movements during practice. The plot to the left gives the curves for pairs of movements adjacent to one another in the motion. The plot to the right gives the curves for pairs of movements that are non-adjacent in the motion. The correlation values at the .05 and .01 points are indicated by the dotted lines. Among the different curves, only those for the correlations between loaded and nonloaded travel and between positioning and nonloaded travel are statistically significant for all days of practice.


Table 1

Coefficients of Stability of the Component Movement Times

                               Day 1 vs. Day 2     Day [rest of heading illegible]
                               Test-retest         Test-retest
Type of                        Reliability         Reliability
Component      Component       Coefficient         Coefficient
Manipulative   Position        +0.79               +0.86
               Grasp           +0.79               +0.91
Travel         Unloaded        +0.83               +0.88
               Loaded          +0.80               +0.91


The correlation values just described are plotted in Figure 4 as a function of days of practice. In addition, the broken lines on Figure 4 indicate the 5% and 1% levels at which these correlations may be considered to differ significantly from zero. The correlation curves which give values consistently significant at the 5% level are those representing the relations between loaded and unloaded travel movements and between position and unloaded travel movements. Although it might be expected that the position and grasp movements would correlate highly with one another, these two movements in fact show no significant relation with one another.

The main point to be observed in Figure 4 is the change in the correlation between component movements in relation to practice. It will be observed that none of the correlations between different pairs of movements is significantly altered during the three days of practice. In other words, the interrelation of movement components does not change during learning and improvement in performance in the over-all task. If the correlation between component movements is thought of as a measure of integration between movements, then the results show that learning has little or no effect on the integration of separate movements in assembly motions.
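The count of six correlations per day follows from taking every unordered pair of the four component scores. The sketch below illustrates the computation for a single day, using a product-moment correlation as a stand-in for whatever coefficient the authors computed. The subject scores are hypothetical values invented for illustration; they are not data from the study, and the dictionary keys are illustrative names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical mean times in seconds for six subjects on one day of practice.
scores = {
    "unloaded_travel": [0.41, 0.38, 0.45, 0.40, 0.36, 0.43],
    "grasp": [0.62, 0.71, 0.58, 0.80, 0.66, 0.69],
    "loaded_travel": [0.44, 0.40, 0.47, 0.43, 0.39, 0.45],
    "position": [0.95, 1.10, 0.88, 1.02, 1.21, 0.99],
}

# Every unordered pair of the four components: 4 * 3 / 2 = 6 correlations.
correlations = {
    (a, b): pearson(scores[a], scores[b]) for a, b in combinations(scores, 2)
}

for (a, b), r in correlations.items():
    print(f"{a} vs. {b}: r = {r:+.2f}")
```

Repeating the computation for each day of practice would give the three points plotted for each pair of movements.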


Inasmuch as the present experimental techniques represent new procedures for [text illegible], it is of interest to examine the temporal consistency of the scores obtained. It will be recalled that each subject performed two trials on each of the three days of the experiment, and that four scores, representing the four component movements, were obtained on each trial. To obtain correlation coefficients between days for each of the four component movements in the task, a mean score for each subject for each type of movement was obtained and these values for different days were then correlated with one another. Table 1 summarizes the values obtained. The table shows that a high degree of consistency is found for the component movement times in the assembly task. It should be noted again that the figures given in Table 1 are based upon only two trials of performance per day and that the figures represent consistency between days. These figures in Table 1 show that the low intercorrelations between component movements, as given in Figure 4, cannot be accounted for in terms of unreliability of the individual movement times.


Summary 


This study describes electronic techniques 
which make possible detailed experimental 
study of the component movements in assem- 
bly motions. Besides indicating the nature 
and value of these methods for analysis of in- 
dustrial motions, the present paper describes 
experimental results bearing upon the char- 
acteristics of component movements in an as- 
sembly task, and upon the effects of practice 
on the duration and relation between these 
movements. 

By means of the electronic techniques de- 
scribed, the component movements of travel, 
grasp, loaded travel and positioning in a uni- 
manual assembly task may be isolated and 
their durations measured in hundredths of 
seconds. Distributions of the component 
movement times for forty-six college student 
operators are approximately normal. 

Practice does not affect uniformly different types of movement in the assembly task. The efficiency of the two manipulative movements, positioning and grasping, is changed the most by practice. In contrast, the travel motions in the task, especially the unloaded travel component, show very little change with practice. Similar results are found for measures of variability of response during practice.

It has been observed also that practice does not alter significantly the correlation between component movements in the assembly task. Early in learning the correlations between the different component movements of manipulation and travel are near zero. As practice continues, the values of these correlations do not change significantly. At the start of learning, a correlation of about +0.07 is found between the two travel movements in the task. With practice, this correlation increases slightly but not significantly.

The results on the effects of practice, as 
just described, are equivalent to experimental 
findings obtained in previous studies of learn- 
ing and component movements in tracking be- 
havior (2, 3) and panel control motions (1, 
4, 5, 6). In all of these tasks, the human 
body seems to act like a many-channeled sys- 
tem in the production of discrete unrelated 
movements for a given task, the integration 
of which is not altered materially by con- 
tinued practice. The findings described here, 
along with similar results found for other mo- 
tion patterns, point toward the lack of signifi- 
cance of learning concepts in understanding 
the details of movement coordination in skill.
The same results, however, point up the great 
significance of component and dimensional 
motion analysis in dealing practically and 
theoretically with problems of skilled motion 
and the instrumental relations of such motion. 




The electronic methods of motion analysis
described in this report provide, for the first
time, economical methods of obtaining reliable
measures of movement components in assembly
skills. These methods, along with procedures
of preplanning and quantitative control
of the dimensions of motion in work, lay
the foundation of scientific study of motion in
terms of modern experimental designs.


Received October 6, 1952. 


References 


1. Davis, R., Wehrkamp, R., and Smith, K. U. Dimensional
analysis of motion: I. Effects of
laterality and movement direction. J. appl.
Psychol., 1951, 35, 363-366.

2. Lincoln, R. S., Simon, J., and DeCrow, Bs nt 
Effects of practice upon different component
movements in tracking. Per. and Mot. Skills
Res. Exch. (In press).

3. Lincoln, R. S., and Smith, K. U. Systematic
analysis of factors determining accuracy in
visual tracking. Science, 1952, 116, 183-18

4. Rubin, G., and Smith, K. U. Learning and integration
of component movements in a pattern
of motion. J. exper. Psychol. (In press).

5. Smith, K. U. and Wehrkamp, R. A universal
motion analyzer applied to psychomotor
performance. Science, 1951, 113, 242-244.

6. Wehrkamp, R. and Smith, K. U. Dimensional
analysis of motion: II. Travel distance effects.
J. appl. Psychol., 1952, 36, 201-206.


The Journal of Applied Psychology
Vol. 37, No. 4, 1953


The F minus K Index on the MMPI 


Angus G. MacLean, Arthur T. Tait, and Calvin D. Catterall

California Test Bureau, Los Angeles


Hunt (2) compared F-K raw scores for
honest and dissembled MMPI profiles from
the same subjects and found that an F-K
cutting score of -11 and below was fairly
effective in spotting fake-good cases, but also
picked up too many supposedly honest cases.
Gough (1) reported the distribution of F-K
scores for a group of 691 adult normals, the
middle 80% of whom obtained approximately
normally distributed scores ranging from -22
to +11 with a mean of about -9.

The present data consist of a sample of 100
candidates for nursing randomly drawn from
a large number of similar cases in the Los
Angeles and San Francisco metropolitan
areas. These candidates are almost without
exception female, are finishing the 12th grade
or are in their first year of college; if not
attending college, they range from 18 to 22
years of age with a modal age of 18 to 19.
These cases provide data worth noting since
they are in a selection situation; furthermore,
the balance of their MMPI profiles suggests
that they are motivated to present themselves
in the best light. Few T-scores exceed 60 on
any clinical scale and virtually none exceed
[illegible].

The F, K, and L scores for these subjects
are distributed as follows: Actual distribution
of F scores: T = 50: 64 cases; T = 53: 17
cases; T = 55-59: 8 cases; T = 60-64: 10
cases; and T = 73: 1 case.

A mean and standard deviation should not
be calculated for such a distribution.

The distribution of K scores had a mean
of 57.9 and a standard deviation of [illegible].
The range was from 30 to 79, and the middle
80% of cases lay between 46 and 69.

The L scale had a mean of 72.0 and a
standard deviation of 8.1. The total range
was from 50 to 84, and the middle 80% of
cases lay between 61 and 82. The scores were
slightly negatively skewed, and it is possible
that these candidates, while defensive on the
F and K items, felt that frankness on the
L items would not be damaging, or even
that they might be “catch” items.

The distribution of F-K raw scores, somewhat
smoothed, is shown in Table 1.

Table 1

F-K Raw Score Distribution on MMPI Based on a
Sample of 100 Candidates for Nursing

Sten Score        Raw Score            % in Category   Cumulative
(1 sten = 0.5σ)   (Lower Limits)       (Smoothed)      %
9                 -25 up                2.3              2.3
8                 -22                   4.4              6.7
7                 -20                   9.2             15.9
6                 -17                  15.0             30.9
5                 -14                  19.1             50.0
4                 -11                  19.1             69.1
3                 -9                   15.0             84.1
2                 -7                    9.2             93.3
1                 -2                    4.4             97.7
0                 -1, 0, or positive    2.3            100.0


It is suggested that a cutting score of -17
might be tentatively adopted: this would include
31% of the nurse candidates but only
about 12% of the “Adult Normal” cases discussed
by Gough. A cutting score of -14
would include 50% of nurse candidates and
about 22% of Gough’s “Adult Normals.” It
may be assumed, lacking information to the
contrary, that the “Adult Normals” consisted
merely of non-pathological cases, and included
quite a wide range of cases from the moderately
self-critical and fairly honest to the
highly ego-defensive. About 1% of Gough’s
cases obtained F-K scores of -22 or more.
Thus, if a cut-off is established at -17, about
12 or 15% of Gough’s cases will be included,
it is true, but it may well be supposed that
these cases are not so much unusually
well-adjusted, as those with a tendency to
dissemble even in a non-selection situation.
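The "% in Category (Smoothed)" column of Table 1 can be compared with the areas of a normal curve cut into ten bands half a standard deviation wide. The sketch below is an illustration only: that the authors smoothed by fitting a normal curve in this way is an assumption, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative area of the standard normal curve up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sten_percentages(width=0.5, n_categories=10):
    """Percentage of a normal distribution falling in each sten band.

    Assumes bands `width` standard deviations wide, centred on the mean,
    with the two end bands open-ended (hypothetical reading of Table 1).
    """
    half = n_categories // 2
    # Interior cut points: -2.0, -1.5, ..., +2.0 for ten half-sigma bands.
    cuts = [width * k for k in range(-(half - 1), half)]
    cdf = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
    return [round(100.0 * (hi - lo), 1) for lo, hi in zip(cdf, cdf[1:])]


if __name__ == "__main__":
    print(sten_percentages())
```

Under these assumptions the ten band percentages can be set beside the smoothed column of the table, and their running sum beside the cumulative column.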

In view of the above data, the following
ranges for the F-K index are suggested:

1. Positive Raw Score, especially if greater
than +2: malingerers or very self-critical
and unusually honest subjects.

2. Raw Score from zero to -10: Normal
Area.

3. Raw Score from -11 to -16: Doubtful
Area.

4. Negative Raw Score of -17 or beyond:
Probable “Fake-good” Area, especially
when the F-K difference lies in the
twenties.
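The suggested ranges can be written as a small lookup. This is a minimal sketch, not the authors' procedure: the function name and labels are hypothetical, and treating a score of exactly -17 as falling in the "fake-good" area is an assumption based on the cutting score of -17 proposed earlier.

```python
def classify_f_minus_k(f_raw, k_raw):
    """Place an F-K raw-score difference in one of the four suggested areas.

    Assumption: -17 itself counts as "probable fake-good", following the
    tentative cutting score of -17 discussed in the text.
    """
    index = f_raw - k_raw
    if index > 0:
        return "positive"            # malingering or unusual self-criticism
    if index >= -10:
        return "normal"              # zero to -10
    if index >= -16:
        return "doubtful"            # -11 to -16
    return "probable fake-good"      # -17 or beyond


if __name__ == "__main__":
    for f, k in [(10, 5), (5, 5), (2, 14), (3, 20)]:
        print(f - k, classify_f_minus_k(f, k))
```

As the authors caution, any such cut refers to a specified pair of populations, so the boundaries would need checking before use with a dissimilar group.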


Optimal cut, as always, refers to a specified 
pair of populations, and caution must be used 
in generalizing to dissimilar groups and situa- 
tions. For example, in another study we 
found the F-K raw-score difference to range 
from -25 to +30, with a middle-80% range
from -16 to +9, a mean of -5 and a
median of -6. The SD for this group is
approximately 10 raw-score points. We can
only assume that some members of this group 
desired to appear in a good light, while others 
were either malingering or unusually self- 
critical. This would suggest regarding small 
positive scores as indicative of frankness and 
insight, while the suspicion of malingering 
would pertain to positive scores in the neigh- 
borhood of + 15 or more. 


Received February 11, 1953. 
Early publication. 


References 


1. Gough, H. G. The F minus K dissimulation index
for the Minnesota Multiphasic Personality
Inventory. J. consult. Psychol., 1950, 14,
408-413.

2. Hunt, H. F. The effect of deliberate deception
on Minnesota Multiphasic Personality Inventory
performance. J. consult. Psychol., 1948,
12, 396-402.




Psychological and Personal History Data Related to 
Accident Records of Commercial Truck Drivers * 


James W. Parker, Jr. 
Tufts College 


* This study is a condensation of a thesis submitted
to the North Carolina State College of Agriculture
and Engineering of the University of North Carolina,
Raleigh, in partial fulfillment of the requirements
for the degree of Master of Science in Industrial
Psychology. The author wishes to express
appreciation to Dr. Dannie J. Moffie under whose
direction this study evolved.


This study is one phase of a larger over-all
research project being carried out in the
Department of Psychology at North Carolina
State College, Raleigh. The general purpose
is to improve the method of selecting truck
drivers to be employed by a large East Coast
trucking concern.

Criterion data for studies such as this have
long been a subject of controversy. Up to
the present time the data used have been the
accident records. The manner in which these
data have been used has been varied, but no
really “new” idea in criterion data has been
devised. It is basically the accident record,
therefore, which was used in this study, and
the records for the various subjects were
equated on the basis of the number of miles
driven by each.

The subjects were 104 drivers who were
still employed by the company on a certain
cut-off date, and who had been trained at the
Driver Training School operated by the
North Carolina State College. All subjects
had been tested at the training school by
an examiner trained in psychological testing,
and under standardized conditions. In most
cases the test results were not used in the
employment procedure by the company.
Some subjects did not have a complete report
of their test performance in their files,
and this necessitated using varying sample
sizes for the different variables.

The factors or variables with which the
accident records are compared are divided
into two categories: (a) psychological test
data and (b) personal history data which
were derived from the application blank filled
out by the employee at the time of his
employment by the company. The tests used
were some of the better-known paper-pencil
tests of intelligence, mechanical comprehension,
personality, and vocational interest. The
results of visual screening with the Bausch
and Lomb Ortho-Rater were also included.
Personal history data included such things as
age, education, marital status, blood pressure,
etc.

Two sources were used to arrive at the
criterion data: the number of accidents a man
had incurred during a certain period of time,
and the total number of actual miles driven
during this same period of time. The accident
rate was then put on the basis of the
number of accidents per 5,000 miles. This
was done for two types of accidents, those
classified preventable and non-preventable by
the safety department of the trucking company.
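The exposure-equated criterion described above amounts to a simple rate. The following sketch shows the arithmetic; the function name and the sample figures are hypothetical and are not taken from the study's records.

```python
def accidents_per_unit_mileage(accidents, miles_driven, unit_miles=5000.0):
    """Accident rate equated for exposure: accidents per `unit_miles` driven."""
    if miles_driven <= 0:
        raise ValueError("miles_driven must be positive")
    return accidents * unit_miles / miles_driven


if __name__ == "__main__":
    # Hypothetical driver: 3 preventable accidents in 100,000 miles.
    print(accidents_per_unit_mileage(3, 100_000))
```

Computed separately for preventable and non-preventable accidents, this gives each driver the two criterion scores used in the analyses that follow.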
Procedure 

The distributions of scores for the entire 
sample of 104 drivers on the psychological 
test data and the personal history data were 
divided into two groups, accident group and 
non-accident group, with respect to each of 
the two criteria, preventable and non-pre- 
ventable accidents per 5,000 miles. The acci- 
dent group for each of the criteria was fur- 
ther divided into upper and lower halves, ex- 
cluding the accident-free group. Figures 1 
and 2 show the distributions of preventable
and non-preventable accidents.


Fig. 1. Distribution of preventable accidents. (Horizontal axis: preventable accidents per 5,000 miles; vertical axis: number of drivers.)

Fig. 2. Distribution of non-preventable accidents. (Horizontal axis: non-preventable accidents per 5,000 miles.)


“Student’s” t-ratio (2) was run between
the means of the groups for each of the variables
as follows: between the accident and
non-accident groups, and between the upper
and lower halves of the accident group.

The F-ratio test of differences was run between
the variances of the groups mentioned
above, and in like manner. This was done in
order to determine if the distributions being
compared by means of the t-ratio had significantly
different dispersion within the distributions.
In any analysis where the F-ratio was
significant at the 5 per cent level of confidence
or better, and the significance level of
the t-ratio was critical, the significance levels
of the t-ratios were corrected for the difference
in variance (1).
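The two tests named above can be sketched with the standard library alone. The pooled t-ratio and the variance ratio follow their textbook definitions. The correction the author applied for unequal variances (reference 1) is not described in the text, so Welch's approximation is used here as a stand-in and should be read as an assumption. All function names and the sample data are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t-ratio for two independent groups (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def variance_ratio(a, b):
    """F-ratio of the larger sample variance to the smaller, with its df."""
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, len(a) - 1, len(b) - 1
    return vb / va, len(b) - 1, len(a) - 1


def welch_t(a, b):
    """t-ratio without pooling, with Welch's approximate degrees of freedom.

    Stand-in for the unequal-variance correction; not necessarily the
    method used in the study.
    """
    na, nb = len(a), len(b)
    qa, qb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(qa + qb)
    df = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    group_a = [1, 2, 3, 4, 5]      # hypothetical scores
    group_b = [2, 4, 6, 8, 10]     # hypothetical scores
    print(pooled_t(group_a, group_b))
    print(variance_ratio(group_a, group_b))
    print(welch_t(group_a, group_b))
```

The ratios would then be referred to t and F tables at the chosen level of confidence; significance look-up is left out to keep the sketch free of external libraries.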

On the basis of the analysis for preventable
accidents per 5,000 miles between the accident
and non-accident groups, those six variables
having the most significant t-ratio were
chosen to be analyzed by the Wherry-Doolittle
Shrunken Multiple Correlation Technique
(3). This type of analysis gives the
maximum coefficient of correlation between
the test variables and the criterion score after
a correction has been made for the chance
error added by each variable. The first step
in this technique is the computation of
intercorrelations among the variables included and
the criterion (in this case, number of
preventable accidents per 5,000 miles).
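The idea of correcting a multiple correlation for chance error can be illustrated with the familiar adjusted-R form of the shrinkage formula, which is often attributed to Wherry. Whether the Wherry-Doolittle procedure in reference 3 uses exactly this form is an assumption, and the function name and example figures are hypothetical.

```python
import math


def shrunken_multiple_r(r, n_cases, n_predictors):
    """Shrink a multiple correlation for sample size and number of predictors.

    Uses 1 - (1 - R^2)(N - 1)/(N - k - 1); assumed, not confirmed, to match
    the correction applied in the study.
    """
    denom = n_cases - n_predictors - 1
    if denom <= 0:
        raise ValueError("need more cases than predictors plus one")
    adjusted_r2 = 1.0 - (1.0 - r * r) * (n_cases - 1) / denom
    # A negative adjusted value means no dependable relationship remains.
    return math.sqrt(adjusted_r2) if adjusted_r2 > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical case: R = 0.50 from 4 predictors on 50 drivers.
    print(round(shrunken_multiple_r(0.50, 50, 4), 4))
```

Because each added variable raises the penalty, a variable is worth keeping only when its gain in the unshrunken correlation outweighs the added shrinkage, which is how a six-variable pool can reduce to fewer predictors.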


Results

Tables 1 and 2 show the condensed results
of the study (only those variables showing
significance out of the total of 43 included
in the study are mentioned for the sake of
brevity). In the tables, “difference in favor
of” refers to which half of the sample distribution
has the higher mean score. With
respect to Marital Status and Systolic Blood
Pressure, higher score means that the driver is
married and has higher systolic blood pressure
than the mean for the sample.


Table 1

Variables Showing Significant Differences Between Upper Half, Accident Group
and Lower Half, Accident Group

Variable                      t                            Difference in favor of

Preventable accidents
Systolic Blood Pressure       2.397*                       Lower half
Far Acuity, Right Eye         2.448*                       Lower half
Far Acuity, Left Eye          2.231*                       Lower half
Near Acuity, Left Eye         2.411*                       Lower half
Kuder Mechanical Interest     2.268*                       Lower half
MMPI Hypomania                2.226*                       Lower half
Far Vertical Phoria           2.802**                      Lower half
Near Acuity, Both Eyes        2.146 [marker illegible]     Lower half
Kuder Artistic Interest       10.332 [marker illegible]    Upper half
Kuder Literary Interest       5.181***                     Upper half

Non-preventable accidents
Marital Status                [illegible]                  Lower half
Far Acuity, Both Eyes         [illegible]                  Lower half
Near Acuity, Both Eyes        [illegible]                  Lower half
Near Acuity, Right Eye        [illegible]                  Lower half

* = P < 0.05.
** = P < 0.01.
*** = P < 0.001.


Table 2

Variables Showing Significant Differences Between Accident Group and Non-accident Group

Variable                      t           Difference in favor of

Preventable accidents
MMPI Hypochondriasis          2.011*      Non-accident group
MMPI Masculinity              2.136*      Non-accident group
Kuder Literary Interest       2.358*      Accident group
MMPI Psychasthenia            2.649**     Non-accident group
Kuder Artistic Interest       3.404***    Accident group

Non-preventable accidents
Marital Status                2.413*      Non-accident group
Depth Perception              3.040**     Non-accident group

* = P < 0.05.
** = P < 0.01.
*** = P < 0.001.

As a result of the Wherry-Doolittle
Shrunken Multiple Correlation Analysis, it
was found that only four of the six variables
could be used to obtain the maximum multiple
coefficient of correlation; these four were:
Kuder Literary Interest, MMPI Hypochondriasis,
Masculinity, and Schizophrenia. The
resulting multiple coefficient of correlation
was 0.36. This coefficient is statistically
significant at the 1 per cent level of confidence.

The resulting prediction equation is:

\[ X_c = 0.00234834 X_{35} - 0.00376978 X_{42} - 0.00183060 X_{39} + 0.00271910 X_{31} + 0.14349491 \]

In this equation the symbols have the following
meaning:

X_c = the predicted criterion score, number
of preventable accidents per 5,000 miles
X_{35} = the MMPI Hypochondriasis score
X_{42} = the MMPI Schizophrenia score
X_{39} = the MMPI Masculinity score
X_{31} = the Kuder Literary Interest score

The standard error of a predicted criterion
score is 0.1007.
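Applying the prediction equation is a weighted sum plus a constant. The sketch below uses the coefficients as they are read from the equation above; the printing is damaged, so the signs and the pairing of weights with scales should be treated as assumptions. The text does not say whether raw or T-scores are entered, and the dictionary keys and example scores are hypothetical.

```python
# Weights as read from the printed equation (assumed reading of a damaged line).
WEIGHTS = {
    "mmpi_hypochondriasis": 0.00234834,
    "mmpi_schizophrenia": -0.00376978,
    "mmpi_masculinity": -0.00183060,
    "kuder_literary": 0.00271910,
}
CONSTANT = 0.14349491
STANDARD_ERROR = 0.1007  # standard error of a predicted score, as reported


def predicted_preventable_rate(scores):
    """Predicted preventable accidents per 5,000 miles for one driver."""
    return CONSTANT + sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    driver = {name: 50 for name in WEIGHTS}  # hypothetical scores
    estimate = predicted_preventable_rate(driver)
    print(estimate, "+/-", STANDARD_ERROR)
```

With a multiple correlation of 0.36 and a standard error of about 0.10, any single predicted rate carries wide uncertainty, so the equation is better suited to ranking applicants than to forecasting an individual's record.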


Discussion 


The analysis indicates that the following
variables (Systolic Blood Pressure; Far
Acuity, Right Eye; Far Acuity, Left Eye;
Kuder Mechanical Interest; MMPI Hypomania;
Far Vertical Phoria; Near Acuity,
Both Eyes; Kuder Artistic Interest; and
Kuder Literary Interest) are related to
preventable accidents. That is, those drivers
having high scores on all these variables, with
the exception of Kuder Artistic and Literary
Interests, tend to have a lower number of
preventable accidents. Those drivers having
high scores in Kuder Artistic and Literary
Interests tend to have a higher number of
preventable accidents.

In looking at the analysis of the non-pre- 
ventable accident data, we see that the psy- 
chological test data variables have dropped 
out, and that only the following appear to be 
related to non-preventable accidents—Marital 
Status; Far Acuity, Both Eyes; Near Acuity, 
Both Eyes; Near Acuity, Right Eye; and 
Depth Perception. These all bear the same 
relationship to the non-preventable accident 
data; that is, the higher the score, the fewer 
the number of non-preventable accidents. 

It might be possible to postulate from the
results of this study that one of the main
distinctions between preventable and
non-preventable accidents is the fact that preventable
accidents seem to be related to psychological 
test data. Similarly, it might be said that 
such sensory capacities as the visual skills 
and certain personal history data such as 
Marital Status seem to be related to non- 
preventable accidents. 

On the basis of the Shrunken Multiple Cor- 
relation Analysis, it was shown that by using 
the four variables—Kuder Literary Interest, 
MMPI Hypochondriasis, Masculinity, and 
Schizophrenia—a reasonably satisfactory pre- 
diction of the number of preventable accidents 
per 5,000 miles could be made. This part of 
the study seems to be quite significant. It 
would suggest that the psychological test 
variables, particularly those dealing with per- 
sonality traits, are the variables which con- 
tribute to the discovery of those drivers hav- 
ing a high preventable accident rate. It 
might be said that this would seem to bear 
out the results of the first part of the study. 

It should be mentioned that considerable 
selection has already taken place before these 
particular drivers are hired by the company. 
The mere fact that they attended the North 
Carolina State College Driver Training School 
makes them a select or special population, 
and therefore unlike what might be called 
representative of the general population of 
commercial truck drivers. We do not have 
any “extreme” cases in the sample, and it is 
these who seem to contribute a great deal to 
any analysis, especially one in which an at- 
tempt is being made to isolate variables which 
are contributing to job success.

It must also be noted that the apparent
significance of the t-ratios reported is inflated.
This is due to the fact that those reported are
selected from a total of 43 for each phase of
the analysis of significance of differences.
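The size of this inflation can be gauged with a short calculation. The sketch assumes the 43 tests in a phase are independent, which correlated test scores will not satisfy exactly, so the figures are only a rough guide; the function names are hypothetical.

```python
def expected_chance_findings(n_tests, alpha):
    """Number of 'significant' ratios expected by chance alone."""
    return n_tests * alpha


def prob_at_least_one_chance_finding(n_tests, alpha):
    """Chance of one or more spurious findings, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print(expected_chance_findings(43, 0.05))
    print(round(prob_at_least_one_chance_finding(43, 0.05), 3))
```

On these assumptions, roughly two of the 43 comparisons in each phase would be expected to reach the 5 per cent level by chance alone, which is the author's reason for treating the reported ratios with caution.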


Summary and Conclusions 


This study has been concerned with studying
the relationship of certain psychological
and personal history data to the accident
records of a sample of commercial truck drivers.
The accident data used were of two types,
preventable accidents and non-preventable
accidents, and were equated for the number of
miles driven.

The subjects for the study were 104 commercial
truck drivers employed by a large
East Coast trucking concern.

For each of the criterion groups two types
of analyses were made: (a) the significance
of differences between the upper and lower
halves of the accident group was computed,
and (b) the significance of differences between
the accident and non-accident groups
was computed. In addition, a Wherry-Doolittle
Shrunken Multiple Coefficient of Correlation
was computed for four of the variables
and the criterion score based on the
first analysis of significance of differences.

From this study the following conclusions
can be drawn:

1. A difference seems to exist between
preventable and non-preventable accidents.

2. Psychological traits, as well as sensory
capacities, are important in analyzing the
accident liability for preventable accidents,
while only personal history data and sensory
capacities seem to be important in analyzing
the accident liability for non-preventable
accidents.

This study also demonstrated the applicability
of a technique for controlling exposure
to accident hazard; that is, using the
number of accidents per unit mileage rather
than just the total number of accidents.


Received January 30, 1953.
Early publication.


References

1. Edwards, A. L. Experimental design in psychological
research. New York: Rinehart, 1950.

2. Guilford, J. P. Fundamental statistics in psychology
and education. (2d Ed.) New York:
McGraw-Hill, 1950.

3. Stead, W. H., Shartle, C. L., et al. Occupational
counseling techniques. New York: American
Book Company, 1940.




Applied Psychology in Action 


A New Management Tool for 
Top Executives 


The first issue of Social Science Reporter
appeared on April 15, 1953, and is to be
issued semi-monthly. It is an effort to build
a bridge of better understanding between social
scientists and the business leaders of
America. The editor and publisher is Rex
F. Harlow, 365 Guinda Street, Palo Alto,
California, who wishes to be kept informed on
the research that is being done. His staff will
seek to secure material for publication
through personal interviews with scientists,
the reading of scientific periodicals and
attendance at scientific meetings.

The first issue contained abstracts of ten
projects such as: management organization
and motivation (W. F. Whyte); management
and the individual in an organization
(E. Wight Bakke); multi-relational sociometric
survey (I. R. Wechsler, R. Tannenbaum,
and E. Talbot); administering changes
(Harriet O. Ronken and P. R. Lawrence);
executive retirement (H. R. Hall);
management-employee communication (Nat. Soc. of
Prof. Engineers); improving supervisors (N.
R. F. Maier); pension-plan policies and practices
(M. Puchek); changing attitudes (L. Festinger
and H. Kelley); mass persuasion (D.
Cartwright).


Background of an Industrial 
Psychologist 

Excerpts from a letter and enclosure of personal
data may be of interest and value to all
concerned with putting psychology to work in
business and industry. After congratulating
Dr. Marion A. Bills for her paper on “Our
Expanding Responsibilities” and [illegible] for
developing the Applied Psychology in Action section
in the April, 1953 issue of the Journal of Applied
Psychology, Dr. John F. Michael wrote:

“More than ten years ago I [illegible] become
a psychologist with both feet solidly
entrenched in industry. Consequently [illegible]
a series of jobs each selected with the
criterion being the extent to which it would


contribute to a comprehensive industrial and 
business background. The two page enclo- 
sure, which might interest you, described this 
experience in greater detail. 

“This attitude toward my professional re- 
lation to industry found great support in 
L. L. McQuitty’s article in American Psy- 
chologist of several years ago. He also felt 
the great need of first being in demand to 
business not solely because of one’s psycho- 
logical background but mainly because of 
one’s business background. The implications 
for the problem of communications are obvi- 
ous. This is notwithstanding the fact that 
one of our largest consulting firms in indus- 
trial psychology in selecting their own staff 
maintains they do not particularly consider 
any possible business experience of the psy- 
chologists they hire. 

“It seems fashionable to decry the small
published output from psychologists spending 
all their time in industry. I fail to agree 
with the typical reasons offered by others for 
this. The main cause appears to me to be not 
any dire result of greater income, or any de- 
crease in the desire to contribute to the gen- 
eral advance of the field, but merely the re- 
sult of the lack of time and the need to sell to 
management the ‘problem’ of research. As a 
result it is almost impossible to do research 
as defined by other colleagues who have more 
time. An interesting support of this time fac- 
tor might be the reading habits of the psy- 
chologist in industry. My own observation 
with reference to myself and others is that 
there is an inevitable decrease in professional 
reading from the dropping of the journal club
subscriptions to the selection of a few journals
none of which are completely read. I


feel that such a decrease has not taken place 
so widely among our academic colleagues. 
“The proposed feature is a most excellent
suggestion. Comments to be accepted for
publication should not be restricted too much
with reference to content. Observations
worthy of possible restatement into testable
hypotheses, needs for specific techniques,
notes as to application of psychological principles
to unique problems in business are
examples of possible contributions. In gen-
eral, the basic characteristic should be brev- 
ity. This proposed section should give the 
psychologist in industry a voice by which to 
record some of his many varied actions which 
might be of interest and value to others. The 
thought of writing a comprehensive article 
likely has stifled the expression of many a 
good idea which, perhaps, would have re- 
quired only several short sentences anyway. 
These ideas should not be lost. 

“I fully realize nothing has been said to add 
any value or new points to Dr. Bills’ speech. 
It is just a case of my being very pleased 
with her excellent discussion of the psycholo- 
gist in industry with special reference to those 
of us who do not wear a tag which says ‘psy- 
chologist.’” 

Here are the high-lights of this industrial 
psychologist’s background and training: 

Age: 35; Education: AB, BS in LS, MA, 
and BBA, Western Reserve University; PhD, 
Ohio State University in 1952, Majors: 
Bibliographic research, general business ad- 
ministration, psychological aspects of market- 
ing and merchandising. 

Work History: (1) Business reference as- 
sistant, city public library, 3 years; (2) Re- 
search analyst, procurement division, Quar- 
termaster Depot, 3 years; (3) Market ana- 
lyst, for a large tire and rubber company, 3 
years; (4) Jobs in The Lincoln Electric Company
of Cleveland, Ohio, since January 1951:
automatic saw operator, plant layout detailer,
assistant purchasing agent, factory engineer,
management analyst, and at present
management engineer.

Professional Organizations: (1) American 
Psychological Association, Associate member 
since 1947; (2) American Marketing Associa- 
tion; (3) Cleveland Psychological Associa- 
tion; (4) Personnel Psychologists of Northern 
Ohio. 


Noise and Absenteeism 


A study reported by Dr. de Almeida is concerned
with the maintenance of a work group
of tabulating machine operators. He was
called in because of absenteeism in a tabulating
department of 125 workers. Noise was
found to be a real source of trouble. He
found that this could lead to aural lesions and
neuropathic troubles. He recommended a
rearrangement of equipment, placing machines
on rubber supports, and “Cellotex” insulation
of walls.


Book Reviews 


Heneman, H. G., Jr. and Turnbull, J. G.
(Editors). Personnel administration and
labor relations: a book of readings. New
York: Prentice-Hall, Inc., 1952. Pp. xiv
+ 434. $3.95.

Pigors, P. and Myers, C. A. (Editors). Readings
in personnel administration. New
York: McGraw-Hill Book Company, Inc.,
1952. Pp. xii + 483. $4.50.

Although not explicitly described as such,
these two publications serve as supplementary
reading sources for Yoder’s Personnel management
and industrial relations and Pigors
and Myers’ Personnel administration, respectively.
They reflect the same similarities and
differences as the basic texts themselves. Both
attempt to present selected readings dealing
with the various aspects of the field, the former
stressing the “practical operating problems
that confront the personnel man, labor
relations man, or union leader in everyday
[illegible]” and the latter “the philosophy of
personnel administration, its basic problems and
[illegible], as well as criticisms and doubts
raised by union leaders.”

Heneman and Turnbull group their selections
under four parts: The Setting of Industrial
Relations; Personnel Administration;
Labor Relations; and Research and Evaluation.
They present 170 readings in all, many
of them not being the entire article but excerpts
selected to deal specifically with the
topic under consideration. This method of
presenting readings has certain advantages
and disadvantages. It reduces overlap and
duplication of content but at the same time
requires a clear over-all structure to tie the
excerpts together. This the authors have
attempted to provide by including introductory
statements before each part and by providing
a brief abstract, at the beginning of
each chapter, of the main points of the
reading and its contribution to the topic being
covered. Despite the editors’ care, some
of the excerpts still “hang in mid-air” and
the reader may feel a need for the omitted
material which structured the original articles.

In contrast, Pigors and Myers present
46 readings, organized under six parts:
[illegible] Personnel Administration;
Analyzing and Handling Personnel Problems;
The Foreman; Building and Maintaining
Work Teams; Wage and Work Assignments; 
and Employee Services and Programs. The 
individual readings tend to be longer and to 
be more self-contained (than those in Hene- 
man and Turnbull) and appear to be the com- 
plete original rather than excerpts. They also, 
however, are organized under sub-topics and 
each part includes an introductory statement, 
by the editors, which interprets the reading 
and shows its contribution to the topic being 
treated. 

It would be impossible to rate one of these 
publications as better than the other; they 
are just different. Although both provide 
adequate coverage of the general field and 
have eight authors in common, there is no 
selection which appears in both. Each has 
its favorite sources—Heneman and Turnbull 
the University of Minnesota Industrial Rela- 
tions Center, and Pigors and Myers the Har- 
vard Business Review—but both draw upon 
American Management Association publica- 
tions, the Personnel Journal, recent books, 
and related fields. 

Both of these publications are worth being 
in the industrial psychologist’s library. They 
have a double yvalue—for what is in them and 
for what is not. The readings themselves are 
drawn from a wide variety of sources and 
disciplines and indicate clearly that the indus- 
trial psychologist, to be truly effective and to 
be able to communicate with management, 
come conversant with a wealth of 
e outside of psychological journals. 
s are becoming recognized 
the field is indicated by the 
fact that slightly over 10% of the authors in 
each book are psychologists. However, these 
are the psychologists who have written articles 
for the personnel journals or who have partici- 
pated in conferences run by personnel man- 
agement groups. It seems clear that psy- 
chologists cannot be too modest and coyly 
wait for management to wade through psy- 
chological journals to find out what psychol- 
ogy has to offer. At any rate, only a very 
few of the readings in these two publications 
are taken from psychological journals. 

It is also instructive to note in which areas 
authors from psychology are used. As one 


must b 
literatur 
That psychologist 
as contributors to 


324 


might expect, Selection and Placement, Train- 
ing, and Research are those in which the psy- 
chologist is recognized as making the most 
unique contribution, but other areas are be- 
ginning to be influenced, areas such as mo- 
rale, safety, job analysis, and even adminis- 
tration. However, if these two publications 
are a guide, psychologists have still not made 
themselves felt as contributors to an under- 
standing of labor-management relations, incen-
tive plans, union activities, grievances, disci-
pline, etc. Either we have nothing useful to 
offer on these problems or have not made it 
readily available to nonpsychologists. I hope 
it is the latter. Albert S. Thompson 


Teachers College, 
Columbia University 


Walker, C. R. and Guest, R. H. The Man on 
the assembly line. Cambridge, Massachu- 
setts: Harvard University Press, 1952. Pp. 
175. $3.25. 

The authors have contributed this pilot 
study in an effort to increase our general 
knowledge of the adjustments made and the 
satisfactions derived by workers on an as- 
sembly line. 

It is a thorough, well-organized inquiry into 
the expressed feelings and attitudes of these 
men regarding many facets of their work. 
The book describes in detail the work climate 
at Plant X which is an automobile assembly 
plant. The method of investigation was both 
qualitative and quantitative, giving a clear 
picture of attitudes toward seven character- 
istics of the work. These, as listed by the 
authors, were: “the worker’s immediate job, 
his relations to fellow workers, pay and se- 
curity, his relation to supervision, general
working conditions in the plant, promotion
and transfer, and his relation to the union.”
These attitudes were determined by inter-
views, questionnaires, and the study of overt
behavior.

The results of this investigation are not
new or startling. It is a typical study of atti-
tudes of employees toward their work situa-
tion, using the usual attitude research meth-
odology showing us which aspects of the work
were most liked and disliked.

Unfortunately, the authors do not empha-
size the main reason the employees liked their
work. “Pay.” This criticism can be levelled
at many other social scientists of course. Per- 
haps it is because this aspect is considered out 
of their realm and they can’t do much about 
it. On the other hand, they have come up 
with some realistic suggestions for alleviating, 
at least in part, the main dislikes which these 
men expressed concerning their work which 
was the immediate job content. This
inquiry into the specific aspects of the job
content itself which the men objected to is
the only unique contribution of the book.
Evidently, it was not hard work itself but the
mechanical pacing and the repetitiveness of
the job to which they objected most seri-
ously. Walker and Guest have proposed what
appear to be practical suggestions for mini-
mizing these objections.

The essence of the problem seems to be
that we have progressed too rapidly into a
technique of mass production and it is time
now to stop and re-organize our thinking and
take into consideration the worker himself in
order to solve some of the problems engendered.
Many industrial engineers and members of
plant management, however, are going to dis-
agree until we can prove its economic effec-
tiveness. John M. Coo
Radio Corporation of America, 
RCA Victor Division, Camden, N. J. 
Laird, D. A. and Laird, Eleanor C. Practical
sales psychology. New York: McGraw-
Hill, 1952. Pp. xii, 201. $4.00.

The formula adopted by the authors of this
book is clear and easy to follow up to a point.
Take self-administering tests of ten to
eighteen items each and label them self-suffi-
ciency, dominance, self-confidence, social mix-
ing, friendliness, hygienic habits, fatigue, tact-
fulness, kindness, sympathy, warm-hearted-
ness, optimism, ability to meet emergencies,
considerateness, snobbishness, egotism, fault-
finding, self-centeredness, and hot-tempered-
ness. “This will show the reader oon pant
and in some qualities which are pra
for the Salesperson” (p. 117). It will also
help to convince him that scientific psy-
chology underlies your approach.
n tae with authoritative es eo 
Sie Cut one person out of four “eco, 
eristics which make him disag!e®" yh 
(p. 168 “Records show that graduate oof 
received poor grades in college become a 




salesmen as those who had high grades” (p. 
“Psychological studies have shown that
love of selling or ‘sales drive’ is essential
to sales success” (p. 13). “Ninety-eight per
cent of experienced sales people like selling
better than any job” (p. 15).
lo your points with “studies.” For ex- 
ample, if you want to demonstrate that love
of selling can be developed, quote a study
which shows that the longer men have been
selling fire protection equipment the better
they like it. Do not mention the possibility
that self-selection rather than development
accounts for the results. 
wee with references to big names and 
N SN of the Horatio Alger type. Mention 
at W. Ayer, Marshall Field, John Wana-
maker, Owen D. Young, Abraham Lincoln
and Colonel Edward M. House. Tell how
they succeeded. In the case of Colonel House
quote him as saying, “Let the other fellow
make the mistake first.” Point out that
“House gained political power in the Wood-
row Wilson administration by following that
policy” (p. 264).
Do not allow difficulties to disturb you. If
the concept of the salesman implies a gener-
ality which does not exist, if what is true of
selling soap is not necessarily true of selling
bonds, say so and then forget it. If the cri-
terion of success in selling is so difficult to
establish that it represents a major problem
to psychologists in that field, don’t even say
so. Ignore it.
this. the other hand, abash reviewers such as 
that tne by writing with a style and ea 
ar exceeds what your more conservativ 
es ie produce, Demonstrate that you have 
types ed the best of the common-sense seme 
have about the sales situation and that A 
sis, 2 Tate gift for organization and emp va 
es eat all of the “Sales Power in Five Easy 
Sons” boys at their own game. Convince 


Sven 2 
me, the psychologists that, atter all, aN 
read your 


ie are probably better off to rea Lom 
them most of the literature that is NOW giv 
teat do not say that “The first attempts © 
lation Something about human nature = s 
Dlaus; to the customer produced @ o : 
boo Sible-sounding nonsense. Magazines, m 
Which Were filled with ‘how tO sell’ adv 4 
1 made psychologists, WhO knew huma 




nature, laugh aloud” (p. 38). Who's laugh- 
ing? S. Rains Wallace, Jr. 
Life Insurance Agency Management Association, 
Hartford, Connecticut 


Barlow, F. Mental prodigies. New York:
Philosophical Library, 1952. Pp. 256.
$4.75.

This book is described on its title page as 
“An enquiry into the faculties of arithmetical, 
chess and musical prodigies, famous memo- 
rizers, precocious children and the like, with 
numerous examples of ‘lightning’ calculators 
and mental magic.” Actually most of it is 
devoted to lightning calculators and to ex- 
amples of the mathematical tricks, stunts 
and problem solving used in performances 
of “mental magic.” The author is himself 
something of a “mathemagician,” having 
given performances using his own mnemonic 
devices and calculating short-cuts. His in- 
terest in such matters dates back to the nine- 
teen-twenties or earlier and has led him to in- 
terview several arithmetical prodigies. For 
the most part, however, his knowledge of 
prodigies is second or third hand. 

The first chapter, 58 pages in length, is an 
interesting account of nineteen outstanding 
arithmetical prodigies of the last two and a 
half centuries, together with brief notes on 
twenty-two others less well known. The 
large majority of both groups were born be-
tween 1700 and 1870, and only seven of them 
since 1900. For the outstanding cases the 
author gives where possible date of birth and 
of death, nationality, age when the special 
gift became evident, amount of education, ex- 
amples of calculating feats performed, and an 
estimate of general ability. Here he draws 
heavily from Scripture’s 1891 article on 
“Arithmetical prodigies” in the American
Journal of Psychology, and from Mitchell's 
1907 article on “Mathematical prodigies” in 
the same Journal. As most of the cases re-
ported upon antedated scientific psychology,
their histories, with a few notable exceptions,
are sketchy and poorly documented. The au-
thor cites uncritically hear-say evidence, ar-
ticles from old newspapers and popular maga-
zines, and writers on psychic research. He
also has a few systematic biases, including:
(1) the usual bias of one who has dabbled in
psychic research; (2) an admitted “instinc-
tive dislike of precocious children”; (3) a 
tendency to exaggerate the one-sidedness of 
prodigies and to underestimate their general 
ability. The third of these may be an out- 
come of the second. For example, the Bel- 
gian boy Verhaeghe is described as “an ado- 
lescent of seventeen with the mental age of a 
babe of two years.” Yet this boy, when ex- 
amined by a committee of mathematicians in 
1946, gave the fourth power of 1,246 in 10 
seconds, the sixth root of 24,137,585 in 25 sec- 
onds, and the square of 888,888,888,888,888
in 40 seconds! 

The arithmetical prodigies whose nationali- 
ties are given distribute as follows by coun- 
try: Britain 10, France and Italy 7 each, 
United States 5, Germany 2, and 1 each for 
India, Ceylon, Greece, Belgium, Spain, Swit- 
zerland, Egypt, and Mexico. Britain’s lead 
is no doubt due to the fact that the author 
has covered his own country more thoroughly 
than other parts of the world. Of the United 
States’ five, two were Negro slaves (one of 
them born in Africa), and two were from the 
little state of Vermont. The author’s list in- 
cludes two very famous scientists (Gauss and 
Ampère), a father-son pair (George P. Bid- 
der, Sr. and Jr.), and six university graduates 
(both Bidders, Gauss, Ampère, the American
Safford who became a professor of astronomy, 
and Whately, an Archbishop of Dublin). Sev- 
eral others gave evidence of superior general 
intelligence. At the opposite extreme three
were considered mental defectives, though the 
basis for such classification is dubious in two 
of the three cases. Several described as dull 
were almost certainly well above average in 
general ability. Of the 41 arithmetical prodi- 
gies, four were sheep herders in childhood,
two were blind, two had 12 fingers and 12
toes, and one was born without legs or arms.
The relatively high incidence of sheep herd-
ing and of physical anomalies is hardly ex-
plicable in terms of chance, nor is the sex
ratio of 39 males to 2 females.

The book would have been more acceptable
to psychologists if it had included only the
chapters on History and Data (58 pages),
The Calculations Considered (15 pages), De-
velopment (6 pages), Famous Memorizers
(19 pages), Mnemonics (9 pages) and Men-
tal Magic (53 pages). The remaining hun-
dred pages are a hodgepodge of naive com- 
ment on such topics as heredity and instinct, 
mental imagery, chess and musical prodigies, 
precocity and genius, and the subconscious. 

The book is entertainingly written. The 
authenticated feats of calculation and memory 
which the author has recounted are fascinat-
ing and astonishing. The chapters on mental
magic will intrigue many readers by the reve-
lations given about such things as naming the
day of the week on which a dated event
occurred or will occur; short-cut methods for
a great variety of computations, including
among others squaring, cubing, root extrac-
tion, translating months or years into seconds
and miles into inches or barleycorns; and di-
rections for carrying out such parlor stunts as
“think of a number,” “draw any card,” “a
loan and a present,” “change for a shilling,”
and numerous others. By showing that many
feats which appear so difficult can be mas-
tered by almost anyone with a moderately
high IQ and a fair memory, these chapters
can be expected to increase the number of
amateur mathemagicians whose performances
will probably amuse, mystify, and bore in
about equal proportions.

Lewis M. Terman
Stanford University


Dunsmoor, C. and Davis, O. How to choose
that college. Boston: Bellman Publishing
Co., 1951. Pp. 51. $.90 (cloth hoi

The rapidly expanding number of students
considering college training makes the need
for materials to assist them with their choices
increasingly urgent. This book was written to


3 : which Pi 
discuss many of the questions whic? | at 


typically have. 
tractively illustrated. The authors syst sro)” 
tio” 
requirements, appie ad 
; nd making good “tty 
mitted. This is quite an order for "7 
pages, A rel 
To cover this ground, the autota, o? 
heavily on lists of things to do and av at 
typical recommended high school C% gue”, 


: fred" k 
considerable oversimplification, and he bar 
Suggestions to see a counselor. T 


fof 
ite 
would, therefore, seem to be best SW 




guidance class; without expert fill-in,
students and their parents would prob-
ably either be left with unanswered questions
or be misled.
See plays an interesting dual role in 
left . On the one hand, the student 1s 
self he large measure to decide for him- 
and ee he has the right goals, interests, 
ae This is in spite of the ample 
appro € evidence regarding the tendency to 
istics priate to the self desirable character- 
hand r such circumstances. On the other 
such ‘thin, student is sent to the counselor for 
ters of ings as assistance with writing of let- 
ceipt feplsation and, immediately on re- 
his acceptances, for “review” of these. 
Colle, oficials. dependency likely to plague 
Bie officials and counselors later On. 
pene! handled, this book should make a 
ance ae to the literature on guid- 
simplif ut, because it has been almost ovi 
wheth ied and condensed, it is doubtfu 
as Thes it should be turned over to students 
cult t sole and sufficient guide through a diffi- 
pect John W. Gustad 


Universi 
niversity of Maryland 


Argyris, C. An introduction to field theory
and interaction theory. New Haven: Yale
Labor and Management Center, 1952. Pp.
71. $1.00.
This brief 71-page pamphlet was originally
prepared for members of the Labor and Man-
agement Center at Yale University. Pre-
sented in each case are some basic founda-
tions of each theory, some assumptions under-
lying each theory, some of the goals of each
theory, some basic operations suggested by
each theory, and some of the more important
concepts utilized in each theory. The writ-
ing is clear in so far as the somewhat turgid
style of the authors, whose work is summa-
rized, makes that possible, and as the ex-
treme complexity of the theories allows. As
a presentation of field theory and interaction
theory, therefore, this book may be recom-
mended.

e = € are several points, however, 
the fi itical reader may well 
to lik. place, the author makes no 4 
ing wi up the two sets of theories he } h 

With. Is field- theory compatible © 


was originally 


on which 




interaction theory? Do they deal with the 
same aspects of human behavior? Are there 
alternative descriptions of similar phenomena? 
Is there any evidence of an experimental kind
to make a choice between them possible? No 
information is given on these vitally impor- 
tant points. 

Again, in what way do the concepts and 
ideas of Lewin, on the one hand, and of Chap- 
ple, Arensberg, Whyte, and Homans, on the 
other hand, constitute a theory? In science 
the term theory is ordinarily used to denote 
a system of hypothetical constructs or inter- 
vening variables which enables us: (a) to 
summarize existing knowledge; and (b) to 
make deductions leading to testable predic- 
tions. This clearly is not the purpose of 
either field or interaction theory. Both are 
essentially semantic attempts, occasionally 
leavened by the use of unorthodox mathe- 
matics, to persuade the reader to accept un- 
proven views and to make use of complex 
ways of stating what often appear to be ob- 
vious, commonsense notions. Argyris never 
comes to grips with the problem of proof and 
disproof of theories of this kind, nor does he
attempt to show that they have a function to
perform which could not equally well be per-
formed in other ways.

This failure to provide any form of critical 
discussion is the major weakness of the book. 
The unwary reader might think from Argyris’ 
presentation that Lewin’s use of topological 
concepts, or hodological space, is acceptable
to orthodox mathematicians. This, as far as
I know, is not so, and one might have ex-
pected at least a brief reference to criticisms
of these, as of many other points. A book of
this kind will not be very useful to the initi-
ated, who will be familiar with its contents in
any case; nor will it help the uninitiated to
obtain a balanced view of the theories pro-
posed. The habit of stating a position with-
out answering important criticisms of that po-
sition, or even mentioning that such criticisms
exist, is unfortunately widespread in psychol- 
ogy. This book is an outstanding example of 
what, to the reviewer, appears to be a very 


practice indeed.
H. J. Eysenck


Psychology Department, Maudsley Hospital
London, England 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson,
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota


Introduction to the Rorschach technique. 
Robert M. Allen. New York: Interna- 
tional Universities Press, Inc., 1953. Pp. 


126. $3.00. 
Introduction to exceptional children. Re- 
vised edition. Harry J. Baker. New 


York: Macmillan, 1953. Pp. 500. $5.00. 

Social psychology. Hubert Bonner. New
York: American Book Co., 1953. Pp. 439. 
$4.25. 

Progress in Clinical psychology. Vol. I, Sec. 
1. Daniel Brower and Lawrence E. Abt, 
Editors. New York: Grune and Stratton, 
1952. Pp. 328. $5.75. 

The individual and world society. P. E.
Corbett. Princeton: Center for Research 
on World Political Institutions, Princeton 
University, 1953. Pp. 59. Gratis. 

The MMPI: a review. William C. Cottle. 
Lawrence, Kansas: School of Education, 
Univer. of Kansas Publ., 1953. Pp. 82. 

Logic and language. Second series. A, G.N. 
Flew, Editor. New York: Philosophical 
Library, 1953. Pp. 242. $4.75 

The Grassi Block Substitution Test for meas-
uring organic brain pathology. Joseph R.
Grassi. Springfield: Charles C Thomas,
Publisher, 1953. Pp. 75. $3.00.

Initiating and administering guidance services.
S. A. Hamrin. Bloomington, Illinois: Mc-
Knight and McKnight, 1953. Pp. 220.
$3.00.

Introduction to statistical methods. Palmer
O. Johnson and Robert W. B. Jackson.
New York: Prentice-Hall, 1953. Pp. 394.

Heredity in health and mental disorder.
Franz Josef Kallman. New York: W. W.
Norton Co., Inc., 1953. $5.00.

Problem drinkers can be helped. G. N.
Lansdown. Devon, England: Arthur H.
Stockwell Limited, 1953. Pp. 72.

Psychology of industrial relations. C. H.
Lawshe. New York: McGraw-Hill Book
Co., Inc., 1953. Pp. 350. $5.50.

Comparative conditioned neuroses.
Waldo Miner, Editor. New York: New
York Acad. of Sci., 1953. Pp. 370. $3.50.


The natural superiority of women. Ashley
Montagu. New York: The Macmillan Co.,
1953. Pp. 205. $3.50.

God, labor and management. Alfred sae 
New York: The William-Frederick 
1953. Pp. 28. $1.00. axe 

Group psychotherapy. Florence B. rate 
maker and Jerome D. Frank. Cambri rs 
Harvard Univer. Press, 1953. Pp- 
$6.50. eae, 

The conception oj disease. Walther a 
New York: Philosophical Library. 

Pp. 120. $3.75. 


- A Reiss. 
The universe of meaning. 3 


Samuel 


Anani. 1050" 
New York: Philosophical Library. ! 
Pp. 227. $3.75, p 

Philosophy and the ideological confie, 


Charles S. Seely. New York: philosoph! 
cal Library, 1953, Pp. 319, $5.00. Jo 
The retention of meaningful material. a 
seph Francis Sharpe. Washington, 1952. 
Catholic University of America Press, 
Pp. 66. $1.00. m 
Capitalism overhauled. Job Socius. 
York: The William-Frederick Press 
Pp. 79. $2.00. ast t0 
Introduction to testing and the use ol a 
sults in public schools, Arthur E. T! a 
Robert Jacobs, Margaret Selove pet 
Agatha Townsend. New York: i 
and Brothers, 1953, Pp. 113; $2.00 
Retirement and the industrial wor ker. 
uckman and Irving Lorge. New 
ureau of Publications, Teachers : 
Columbia Univer., 1953. Pp. 10% Aglo 


: : n ica. 
he science of color. Committee 0 meric? 
Imetry of th 


ew 


N 
1952- 


Jaco? 
york: 
le! ey 


e i ociety of 955 

New York: “ieee a eT Co.; 1 4 

Pp. 385. $7.90, vind 

Evaluating research and development: tor? 

R. Weschler and Paula Brown: + pel 

Los Angeles: Institute of Indust"? , 2 
tions, U 


; r g s E 
Niversity of California, 
104. $1.65. j schoo! J 
Research Report of U. S. Naval g, Na" 
Aviation Medicine. Pensacola: U- 

Air Station, 1953. 


Journal of Applied Psychology 


OCTOBER, 1953 


Vol. 37, No. 5


The Weather and Other Factors Influencing Employee Punctuality *


Roland E. Mueser 
The Pennsylvania State College 


On a warm unseasonal day in the
middle of February 1951, the majority of
usually sleepy students arrived bright and
early for an 8:00 o'clock college class. Such
promptness at this hour was as unusual as
was the day in February. The coincidence
invited the comparison of weather and at-
tendance. Did the early morning brightness
stimulate these otherwise uninspired students?

A hypothesis suggested itself: Increased
light intensity might be causing early awaken-
ing, or it might hasten the morning routine of
washing, dressing, and breakfasting. This
study was undertaken to determine the corre-
lation between early morning illumination and
one index of speed of human activity. Promptness
in coming to work was used as a criterion
which might be accurately measured on a
statistically significant population.

The personnel of an engineering research
laboratory on the campus was chosen because
attendance figures could be readily obtained.
Only those employees who were scheduled to
start work at 8:00 a.m. were selected and a
standardized list was used for holding a con-
stant sample. It was, however, impossible to
rigidly limit the sample to the identical group
for practical reasons. Part of the em-
ployees was necessarily absent on business,
vacation, or because of illness during intervals
in the recording period. Eliminating their
records was a prohibitive statistical task and
would also result in a drastically reduced
population size. Actually a total of 144 in-
dividuals were on the standardized list. Of
these, an average of 132.8 or 92.2% were at
work in the Laboratory during the test period.
By extending the study over a number of
months a gross averaging effect has been
achieved and should tend to minimize chance
errors.

*
Len the
Dely for thor wishes to tha
Dons CPI
the Sible €y’s
o
Esen OPerati,
Dstarch pation of the members



Procedure 


Employees of the Laboratory were checked
in by guards at the gate and the time recorded
to the nearest five minute interval. An aver-
age of 101.3 men and 31.5 women were timed
six days a week from February 23, 1951
through May 14, 1951, a total of 69 working
days. Data were recorded independently for
men and women to allow for later comparison.
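
The count of 69 working days can be checked against the calendar. The sketch below is only an illustration: it assumes that the one day off each week was Sunday and that no holidays fell in the period, neither of which is stated in the text. The function name is hypothetical.

```python
from datetime import date, timedelta


def working_days(start, end):
    """Return every date from start to end inclusive that is not a Sunday.

    Assumption (not stated in the article): the six-day week ran
    Monday through Saturday and no holidays interrupted it.
    """
    days = []
    current = start
    while current <= end:
        if current.weekday() != 6:  # Monday is 0, Sunday is 6
            days.append(current)
        current += timedelta(days=1)
    return days


period = working_days(date(1951, 2, 23), date(1951, 5, 14))
print(len(period))  # 69
```

Under these assumptions the calendar gives the same figure as the text, which suggests the recording period was uninterrupted.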


The majority of employees drive to work in
private automobiles and the remainder all
walk. No public conveyance is employed for
transportation, hence patterns due to stand-
ard train schedules are avoided.
Similarly ptions, as might 
be due to 2 
not present. 


not, of course; 
an incident such as a flat tire affects no more
than a single car pool, no large error is intro-
duced by a single transportation mishap. The
average distance traveled to work was 3.8
miles for employees, with approximately 66%
living in State College where the drive or walk
is less than a mile. Although the only
primary division of the population is sex, in-
herently this tends to produce strong second-
ary selectivity. The women of the Labora-

is sincerely appreciated- 50 




tory are mainly secretaries, clerks, typists, 
and a few technicians. The female employees 
are, therefore, an exclusively non-supervisory 
group. The group of 101 male employees is 
composed 60% of research scientists and ad- 
ministrators. Most of the remainder are made 
up of machine shop and male technical em- 
ployees such as draftsmen and scientific as- 
sistants. 

The weather data were obtained from the 
Meteorological Department of the College. 
The most important information for this 
study, a measure of light intensity, was ob- 
tained from the department’s Eppley pyr- 
heliometer. Readings were taken from the 
record of light intensity at half-hour inter- 
vals from 6:00 through 8:00 a.m. and a total 
of these values was used as a measure of the 
light intensity for the early morning. Ap- 
plying these figures to all employees does, of 
course, involve the assumption that the gen- 
eral atmospheric conditions are the same at
all homes as at the college. The closeness of 
most residences to the Laboratory and the 
averaging effect of a large sample would be 
expected to reduce errors due to this assump- 
tion. 
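
The light measure described above is a plain total of the five half-hour readings from 6:00 through 8:00 a.m. A minimal sketch follows; the function name and the sample readings are hypothetical, and the units of the pyrheliometer record are not given in the text.

```python
READING_TIMES = ("6:00", "6:30", "7:00", "7:30", "8:00")


def morning_light_total(readings):
    """Sum the half-hour light readings from 6:00 through 8:00 a.m.

    `readings` maps a clock time such as "6:30" to the value read
    from the record for that time.
    """
    missing = [t for t in READING_TIMES if t not in readings]
    if missing:
        raise ValueError(f"missing readings for: {missing}")
    return sum(readings[t] for t in READING_TIMES)


# Hypothetical values for one morning, in arbitrary units.
example = {"6:00": 0.0, "6:30": 0.5, "7:00": 2.0, "7:30": 4.0, "8:00": 6.5}
print(morning_light_total(example))  # 13.0
```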

Nine other meteorological variables were 
observed at 7:00 a.m. These were included 
in the study in order that they might be con- 
sidered as secondary influences on punctuality 
behavior. 

There is a question as to how early and how 
late arrival times are to be tabulated to obtain 
an average figure which is a sensitive indica- 
tor of deviations which atmospheric conditions 

might be expected to introduce. Although 
there is no obvious error in including all em- 
ployees who arrived early, extreme lateness 
would seem due, a disproportionate fraction 
of the time, to purely chance factors rather 
than the interplay of a subtle meteorological 
influence. The flat tire, sick child, morning 
shopping trip, inoperative alarm clock, or the 
morning return from a business trip all intro- 
duce delays which would overshadow the 
effects being sought. A subsequent study of 
employees arriving very late verified the fact 
that these variations occur randomly. It was 
decided, therefore, to study most intensively 
the average arrival time of employees arriv- 


Roland E. Mueser 


ing at work less than 22.5 minutes late. In 
order that this group be balanced a similar 
limit was placed on early arrivals so that em- 
ployees arriving after 7:37.5 a.m. but before
8:22.5 a.m. were studied.¹ Eighty-six per
cent of all men and 97 per cent of all women
arrived between these times. The average 
arrival times were calculated independently 
for each sex. 
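
The windowed average described above can be sketched as follows. This is an illustration only, not the author's own computing procedure: it assumes each recorded time is the centre of a five minute interval, expressed in minutes after midnight, and the head counts are invented.

```python
LOWER = 7 * 60 + 37.5  # 7:37.5 a.m. in minutes after midnight
UPPER = 8 * 60 + 22.5  # 8:22.5 a.m. in minutes after midnight


def windowed_mean(interval_counts):
    """Mean arrival time, in minutes after midnight, inside the window.

    `interval_counts` maps the recorded five minute time (minutes
    after midnight) to the number of employees recorded at it.
    Arrivals outside 7:37.5 to 8:22.5 are ignored.
    """
    kept = {t: n for t, n in interval_counts.items() if LOWER < t < UPPER}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no arrivals inside the window")
    return sum(t * n for t, n in kept.items()) / total


# Hypothetical counts for one morning: 7:30, 7:50, 8:00, 8:10, 8:30.
counts = {450: 5, 470: 10, 480: 20, 490: 10, 510: 3}
mean_minutes = windowed_mean(counts)
print(mean_minutes)  # 480.0, that is, 8:00 a.m.
```

The arrivals recorded at 7:30 and 8:30 fall outside the window and do not affect the result.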

Conjecture would lead one to believe that
the very early arrivals—i.e. those coming be-
fore 7:37 a.m.—would be extremely sensitive
to meteorological influence since their attend-
ance is not so keenly forced by the conform-
ity-producing 8:00 a.m. deadline. Further-
more, as a group they might be expected to
exhibit greater individuality. In other words
the early birds are more nearly free to do as
they please and so should react markedly to
any atmospheric condition which tends to pro-
duce stimulated or sluggish behavior. Be-
cause of this, arrival times of these early em-
ployees were also studied as an independent
group. Since no women regularly came to
work before 7:37 a.m., this computation was
only possible for men.


Distribution of Arrival Times 


The distribution of arrival times for men
and women was computed for the period
March 9 to May 14. Figure 1 shows the dis-
tribution of average arrival times of 32 women
and 101 men.

¹ Since the raw data of the study were grouped into
five minute intervals all class divisions are at the
half minute and the statistics have been computed
on this basis. However, to avoid a
dangling .5, intervals are quoted here to the minute
and the additional fraction is to be understood, in
this case 7:37 and 8:22.

The distribution is similar to that described
by F. H. Allport (1), called a J curve because
of the decrement characteristic of arrival
times before the 8:00 a.m. deadline. The
drop-off after 8:00 o'clock
on the right side is expected since employees
arriving late are exhibiting non-conforming
behavior. The conjecture seems reasonable
that if the time of arrival were stipulated as
8:00 a.m. but no social or economic pressure


forcing conformity was applied, the resulting
distribution might be a Gaussian or normal
curve. Conversely as the factors to produce
conformity—i.e., to get to work on time—be-
come more compulsive, not only will the curve
be displaced to the left (as Allport points out)
but the skewness of the distribution should
become accentuated. For the extreme op-
posite of “free-will” attendance there would
be a situation where the punishment for late-
ness was so severe that a tardy employee
would stay absent rather than risk lateness.
Such a situation is not as fantastic as it
sounds, for it is close to the actual circum-
stance where being late in catching an im-
portant plane or train is as bad as not ar-
riving at all.

[Figure 1: percent of total falling into each 5 minute interval, plotted against morning arrival time.]

Fig. 1. Average arrival time, March 9-May 14, 1953.

In the distribution illustrated here there is 
a marked difference in the average behavior 
of men and women. Whereas an average of 
about 8% of the men arrive extremely early, 
before 7:37, only 0.1% of the women come 
during this period. Similarly, more than twice 
as many men come extremely late, after 8:22, 
as women. Practically no female employee 
arrived at work later than 9:20 a.m., yet a 
few men drifted in every morning as late as 
10:45 a.m. Beyond this point it is a moot
question whether it is tardiness or half day 
absenteeism which is occurring. 

The tendency for the attendance charac- 
teristics of men to be more widely distributed 
than women is believed due to the high pro- 
portion of male research and professional 
workers. Research is traditionally an occupa- 
tion catering to individualism and personal 
work habits. Even in a highly structured 
situation where it is generally expected that 
regular hours will be kept, some of the tech- 
nical employees do not take attendance rules 
too literally. The tendency towards pro-
nounced earliness would seem to be due to
the greater personal job interest. This is an
understandable manifestation among men
where working is a career. Almost all the
women employees are performing less stimu-
lating service jobs and few expect to work
longer than a few years.

The time for men and women coming be-
tween 7:37 and 8:22 was averaged for each
working day in the test period. If the time
which these employees arrive at work is being
influenced by a common factor, the mean ar-
rival times of the two sexes should have a
positive correlation. Indeed, such a test bears
a resemblance to split-half methods of com-
puting test reliability. Actually the average
arrival times of men and women have a cor-
relation coefficient of .43 with a level of sig-
nificance of 0.02%. Although the two groups
vary in a similar manner from day to day the
mean arrival time for men, 7:55.52 a.m., is
2.6 minutes earlier than that for the women.
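
The relation between a coefficient of .43 and the quoted level of significance can be illustrated with a short sketch. It assumes 69 paired daily means, one pair per working day, and uses the Fisher z approximation for the two-sided probability. The article does not say which test was applied, so this is a swapped-in check rather than the author's method, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_two_sided_p(r, n):
    """Two-sided probability of |r| under no correlation (Fisher z)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed sample size: the 69 working days of the test period.
p_value = approx_two_sided_p(0.43, 69)
print(f"approximate two-sided p = {100 * p_value:.3f}%")
```

With these assumptions the probability comes out at roughly two hundredths of one per cent, in line with the figure quoted above.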

It would be expected that precipitation 
would tend to make employees tardy since the 
majority drive to work, roads become slippery 
under these circumstances, and visibility is
impaired. However, a comparison of rainy 
mornings with dry ones failed to reveal any 
significant difference due to this factor (see 
Table 1). 


Table 1

Effect of Rain on Mean Arrival Time of Employees
Arriving Between 7:37 and 8:22 a.m.

                 Fair Weather    Precipitation
                 (25 days)       (44 days)

101 Men          7:55.80         [illegible]
32 Women         7:58.17         [illegible]
True Average     7:56.36         7:56.02
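The "True Average" row appears to be the mean of the two group means weighted by the number of employees in each group. A minimal sketch of that arithmetic for the fair-weather column follows; reading the times as minutes and hundredths of a minute is an assumption.

```python
def weighted_mean(values, weights):
    """Mean of group means, weighted by group size."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Fair-weather means from Table 1, as minutes after 7:00 a.m.
men_mean, women_mean = 55.80, 58.17
n_men, n_women = 101, 32

true_average = weighted_mean([men_mean, women_mean], [n_men, n_women])
print(f"7:{true_average:.2f}")  # 7:56.37
```

The result agrees with the tabled 7:56.36 to within rounding of the group means.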


Since it is difficult to imagine that on the average no impairment
of driving conditions existed due to precipitation, it appears that
employees foresightedly take bad driving conditions into account and
tend to compensate for the circumstance. It is also possible that
some phenomenon adjunctive to rain has a reverse effect and tends to
produce early attendance. Later results lend credence to this
hypothesis.


Weekly Cycle of Punctuality 


The popularity of the expression "blue Monday" and a general
totaling of introspective reports following any weekend would lead
one to believe that the day of the week might have an influence on
work attendance figures. A plot of the mean arrival time for
employees as a function of day of the week is given in Figure 2. It
can be seen that there is agreement between the two groups with the
exception of a let-down among men on Wednesdays. If punctuality can
be considered a criterion of general feeling tone it is apparent
that Monday is indeed "blue," with people hitting their stride by
midweek and becoming more tardy as the weekend approaches.




The Weather and Other Factors Influencing Employee Punctuality 333 


Fig. 2. Weekly cycle of average arrival time of people coming between 7:37 and 8:22 a.m. [Horizontal axis: day of week.]
A breakdown of all employees into time arrival groups per day
illustrates a cross-pattern in punctuality habits as is shown in
Figure 3. More employees arrive Just on Time and Late as the week
progresses and those coming far ahead of time follow a different
pattern from those arriving just a few minutes early. The curves are
primarily of interest because they illustrate that the shape of the
arrival distribution given in Figure 1 varies somewhat as a function
of the day of the week.

Fig. 3. Weekly cycle of punctuality for men and women. [Panels
include Very Early, 6:57-7:37 a.m., and Just on Time, 7:57-8:02 a.m.
Vertical axis: per cent of total arriving in interval. Horizontal
axis: day of week.]


Meteorological Effects


A figure representing early morning brightness was obtained by
measuring the light intensity recordings of an Eppley pyrheliometer
at half hour intervals from dawn until 8:00 a.m. The sum of these
values, "I," gives a rough integration of total morning brightness.
Over the February to May period covered by the study the average
value of I gradually increased. However, there was no significant
change in employee arrival times indicating adaptation to the
seasonal light change. In general, however, the day-to-day
fluctuations were far greater than the seasonal change (see
Figure 4). Because of the wide range of values, and because
psychophysical brightness discrimination is a relative rather than
absolute phenomenon, light values have been considered
logarithmically (2).

Fig. 4. Daily light intensity from dawn to 8:00 a.m. [Vertical axis:
early morning light intensity I. Horizontal axis: date, February to
May, 1951, with the start of Daylight Saving Time marked.]

The correlation between the average daily arrival times and light
intensity, 10 log I, is given in Table 2.

In general the correlations between brightness and promptness are
significant at about the [illegible] level, but surprisingly in an
inverse manner from that originally expected. Thus on the average
both men and women arrive at work significantly earlier when the
morning is dull and later when it is bright. The daily mean arrival
times for men were grouped into thirds and those for women into
fifths. From Figure 5 it is apparent that the reaction was similar
in both men and women. The average arrival time of women fluctuated
over a much wider interval than that of the men. The men who arrived
Very Early reacted in a contrary manner to the morning light stimuli
just as they did with respect to the weekly cycle in Figure 3.
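The brightness measure described in this section, the sum I of half-hourly light readings expressed as 10 log I, can be sketched as follows. The readings are hypothetical values in arbitrary units, and the use of common (base 10) logarithms is an assumption.

```python
import math

def brightness_index(readings):
    """Return (I, 10 * log10(I)) for a morning's half-hourly readings."""
    total = sum(readings)
    return total, 10 * math.log10(total)

# Hypothetical pyrheliometer readings, dawn to 8:00 a.m. (arbitrary units)
dull_morning = [0.5, 1.0, 1.5, 2.0]
bright_morning = [5.0, 10.0, 15.0, 20.0]

for label, readings in (("dull", dull_morning), ("bright", bright_morning)):
    total, index = brightness_index(readings)
    print(f"{label}: I = {total:.1f}, 10 log I = {index:.1f}")
```

On this scale a tenfold change in summed light intensity corresponds to a change of 10 units, which keeps a very wide range of raw values within a manageable span.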

The data were also examined to determine whether any of the
following meteorological conditions might be a factor influencing
the average arrival times: (1) Corrected barometric pressure at
7:00 a.m.; (2) Barometric tendency in last 20 hours with respect to
direction and magnitude; (3) Amount of barometric fluctuation, i.e.,
atmospheric pressure roughness existing in the previous 20 hours
(roughness was defined as pressure variations lasting less than an
hour and including tendency reversals); and (4) Change in light
intensity in the preceding 24 hours (both direction of change and
curve steepness were considered).

Mills (5), Winslow (6), and others have noted various psychological
effects of barometric pressure on human beings. However, none of the
listed factors showed a significant correlation with the punctuality
criterion. The highest correlation coefficient obtained was +.12
between the barometric pressure and the average arrival time of
women. A slight correlation in this direction would be expected as
arising from the fact that overcast mornings are more common when
the barometer is low. The correlation coefficient between change in
barometric pressure and punctuality was only .01 and other factors
showed similarly low values.

Fig. 5. Effect of light intensity on average time of arrival at
work. [Vertical axis: average morning arrival time for 101 men and
32 women. Horizontal axis: early morning brightness, 10 log I. Data
roughly grouped into one minute classes. Legend: one symbol marks
points that are each the average of 420 arrival times, the other
marks points that are each the average of 2300 arrival times.]


Table 2

Correlation of Employee Punctuality and
Early Morning Light Intensity
(Morning Brightness, 10 log I)

                                        Correlation    Level of
                                        Coefficient    Significance

Average arrival time, most men
  (7:37-8:22 a.m.)                         +.22            1%
Average arrival time, most men,
  after correction for weekly cycle        +.22            1%
Average arrival time, most women
  (7:37-8:22 a.m.)                         +.16           19%
Average arrival time, most women,
  after correction for weekly cycle        +.24            5%
Average arrival time, very early
  men (before 7:37)                        -.26            3%

Some studies have shown that temperature and humidity factors are
of physiological importance (4). Nevertheless, over the period
studied, the environment of the subjects was almost entirely
controlled by artificial means, home furnaces, car heaters, etc.,
rather than by the prevailing meteorological conditions. No
important correlation of such factors as temperature, humidity,
wind direction, or wind velocity with employee attendance was
evident. At this time of the year people are largely protected from
direct weather influence by house walls and [word illegible]. These
enclosures probably reduce psychological and physiological
manifestations. However, atmospheric pressure and light intensity
would seem to penetrate such barriers to a greater extent than
wind, cold, or humidity.


Discussion and Summary


The data studied here covering 8000 arrival times over a
three-month period indicates that the promptness with which
employees arrive at work is apparently inversely related to the
brightness of the morning light. This
trend appears equally true for both sexes, 
with the exception of about 6%, all males, 
who come to work far ahead of the official 
starting time. Since their early arrival in it- 
self sets these 6% as rather an individualistic 
group, it is not surprising to find that their 
reaction to both the weekly “fatigue” cycle 
and the light intensity is the converse of the 
rest of the workers. 

In general, these reactions by employees 
would seem to reflect on their attitude about 
their jobs rather than serve as any reliable 
indication of feeling tone or satisfyingness (3). 
Thus it is easy to imagine that when it was 
sunny and beautiful outside the chore of earn- 
ing a livelihood was put off. Perhaps a few 
fathers played a little longer with their chil- 
dren or paused to sniff a crocus on their way 
to the car. On the other hand on a dark, 
dismal morning, more often than not, the 
tired secretary and sleepy engineer drank 
their coffee more quickly and set off to work 
promptly and without fanfare. 

Those men who arrive very early would 
seem to regard their work differently than the 
majority. One can only conjecture that they 
eagerly hurry to the job in the early morning 
sunshine, anxious to start another day of 
activity while their compatriots dawdle an 
extra two minutes admiring the blue skies. 
A world which produces both Stoics and 
Epicureans should not find such diverse be- 


havior surprising. 
Received November 13, 1952. 


References 


1. Allport, F. H. The J-curve hypothesis of conforming behavior. In
   Readings in social psychology. New York: Henry Holt, 1947.

2. Bartley, S. H. Studying vision. From Methods of psychology. New
   York: Wiley and Sons, 1948.

3. Bills, A. G. Studying motor functions and efficiency. From
   Methods of psychology. New York: Wiley and Sons, 1948.

4. Hirsh, J. Comfort and disease in relation to climate. Climate
   and man. Washington, D. C.: U. S. Government Printing Office,
   1941.

5. Mills, C. A. Medical climatology. Springfield, Ill.: Charles C
   Thomas, 1939.

6. Winslow, C.-E. A., and Herrington, L. P. Subjective reactions of
   human beings to certain outdoor atmospheric conditions. Heat.,
   Piping & Air Conditioning, 1935, 7, 551-556.


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 5, 1953 


Prediction of Turnover Among Clerical Workers 


Philip H. Kriedt and Marguerite S. Gadel 


The Prudential Insurance Company, Newark, N. J. 


Companies like The Prudential Insurance 
Company which hire a large number of High 
School girl graduates to do routine clerical 
work frequently have a turnover problem. 
We find that among the High School girls we 
hire each year some become permanent em- 
ployees and make a career of their jobs. A 
larger number work for a few years and then 
quit to become housewives and raise a family. 
Both these groups we feel are good invest- 
ments. There is a third group of new em- 
ployees which concerns us, however. They 
are the girls who leave in a year or less to 
take other jobs or to go to college. These we 
consider to be a turnover problem. 

We have done several investigations to see 
if we can reduce our turnover rate by de- 
termining at the time of employment whether 
or not a girl is a good turnover risk. Some 
of our most recent findings from this research 
are summarized in this article. 


Predictor Measures 


All High School girls hired in June, 1951, 
were given an experimental battery of tests 
and questionnaires selected as possible pre- 
dictors of turnover. The battery included a 
measure of intelligence, a measure of clerical 
aptitude, an interest questionnaire, a bio- 


graphical data blank, and a job preference 
questionnaire. 


1. General ability or intelligence was measured by two tests:
Vocabulary and Arithmetic Reasoning. Scores for these two tests are
combined for purposes of predicting turnover.

2. Clerical aptitude was measured by four tests: Name Checking,
Number Checking, [one test name illegible], and Letter-Digit
Substitution. These four scores are combined to give a single
clerical speed test score.

3. Interest scores were obtained from a questionnaire developed by
the Company consisting of 285 items similar in form and content to
those used by Strong. A key to predict turnover, consisting of 15
items scored by unit weights, was developed from data obtained in
another study. A longer key consisting of 43 items did not
cross-validate as well as the shorter key. The key identifies poor
turnover risks as girls who like artistic, literary, scientific,
selling, and social service activities, and who dislike manual,
mechanical and clerical activities.

4. Biographical information was obtained in a blank including both
factual and attitudinal questions related to educational and family
background. Fourteen multiple choice questions were given unit
scoring weights. Some examples of questions are these:

Would you like to go to college if you could afford it? (Check one)
__ a. Yes; __ b. No; __ c. Not sure.

Which of the following best describes your High School course of
study? (Check one) __ a. College preparatory or academic; __ b.
Commercial and secretarial; __ c. General; __ d. Other ____

Which of the following occupational groups best describes your
father's work during most of his life? (Check one) __ a.
professional; __ b. managerial or executive; __ c. [illegible]
business; __ d. clerk in a store; __ e. clerk in an office; __ f.
salesman; __ g. skilled trade; __ h. farmer or rancher; __ i.
semi-skilled (factory worker, miner, etc.); __ j. [illegible];
__ k. Don't know.

5. The last predictor was a job preference questionnaire which is a
modification of the form developed by Jurgensen (3) at the
Minneapolis Gas Company. This form requires the respondent to rank
11 factors (Advancement, Benefits, Communication, Company,
Co-workers, Hours, Pay, Security, Supervision, Type of Work, and
Working Conditions) in terms of their relative importance to her;
and also to rate the importance of having a job which is
interesting, important, not strenuous, free from work pressure,
uses her abilities, has much responsibility, and allows freedom for
planning one's own work.
Procedure and Results


This battery was administered to 358 new employees in June, 1951.
Sixty-five of these left in three months or less and 43 more left
from four to twelve months after being hired. Point biserial
correlations were computed for each of the five predictor variables
to predict turnover. The validity coefficients for three-month and
twelve-month turnover are given in Table 1.


Table 1

Point Bi-serial Correlations Between Various Predictors and
Turnover for Clerical Employees*

                          3 Month Turnover    12 Month Turnover
                          (Leaving N=65)      (Leaving N=108)
Predictor                 (Staying N=293)     (Staying N=250)

General Ability Tests          -.25                -.21
Clerical Speed Tests            .03                 .05
Interest Questionnaire          .19                 .19
Biographical Data               .37                 .29
Job Preference Blank            .33                 .21

* Negative correlations in this table indicate that those who left
scored higher than those who stayed.


Table 1 shows that the General Ability Tests have a validity of
-.25 for three-month turnover and -.21 for twelve-month turnover.
Negative validity means that girls who left had higher scores than
those who stayed. In a previous study of eighteen-month turnover
for 1600 girls a validity of -.17 was obtained for these two tests.
Clerical speed tests have practically zero validity. In two
previous studies these tests had slightly higher validity. The
interest turnover key has validity of .19 for both groups. This is
a cross-validation result as the key was developed in another
study. Biographical Data yields validities of .37 and .29. Girls
who leave, as compared with those who stay, more frequently say
they took college preparatory courses, have fathers in professional
and managerial jobs, and would like to go to college if they could
afford it. Although these identical items have not been used
before, similar questions have been used with similar results.

The validities of the Job Preference Questionnaire, .33 and .21,
have not been cross-validated. The key for this measure was
developed empirically on this sample. Girls who leave, as compared
with girls who stay, placed more importance on type of work, pay,
and on having a job which used their abilities and gave them
freedom to plan their own work. Those who left placed less
importance than those who stayed on working for a company they are
proud of, on company benefits and on being free from work pressure
and strenuous physical requirements. Since these results have not
been cross-validated we did not compute intercorrelations between
the Job Preference Questionnaire and other predictors and we did
not use Job Preference scores in the multiple correlation solution.
We are interested in a future cross-validation of the results
obtained with this questionnaire.

Intercorrelations among the four variables used in computing
multiple correlations are given in Table 2. In Table 3, you will
see that a multiple R of .40 was obtained for three-month turnover
and .33 for twelve-month turnover. In both prediction equations,
Biographical Data has much more weight than the other predictors.
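The validities in Table 1 are point biserial correlations between a predictor score and a stay-or-leave outcome. A minimal sketch of that statistic follows; coding the outcome as 1 for staying and 0 for leaving is an assumption chosen to match the table's sign convention, and the scores are hypothetical.

```python
import math
from statistics import mean, pstdev

def point_biserial(scores, stayed):
    """Point biserial r between a score and a 0/1 outcome (1 = stayed)."""
    stay = [s for s, g in zip(scores, stayed) if g == 1]
    leave = [s for s, g in zip(scores, stayed) if g == 0]
    p = len(stay) / len(scores)   # proportion who stayed
    q = 1 - p                     # proportion who left
    return (mean(stay) - mean(leave)) / pstdev(scores) * math.sqrt(p * q)

# Hypothetical scores: the two leavers score higher than the two stayers,
# so the coefficient is negative, as for the General Ability Tests.
scores = [3, 4, 1, 2]
stayed = [0, 0, 1, 1]
print(round(point_biserial(scores, stayed), 3))  # -0.894
```

With this coding a negative coefficient means the leavers scored higher, which is the reading given in the footnote to Table 1.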


Table 2

Intercorrelations of Predictor Variables
(N = 358)

                          Clerical    Interest      Bio-
                          Speed       Question-     graphical
                          Tests       naire         Data

General Ability Tests       .20         -.32          -.41
Clerical Speed Tests                    -.04          -.10
Interest Questionnaire                                 .39


Table 3

Multiple Correlation Data

                       Multiple
                       Point
                       Biserial
Turnover Group         R           Beta Weights*

3 Month Turnover       .40         Biographical Data, General
                                   Ability Tests, Clerical Speed
                                   Tests, Interest Questionnaire
                                   [weights illegible]
12 Month Turnover      .33         Biographical Data, General
                                   Ability Tests, Clerical Speed
                                   Tests, Interest Questionnaire
                                   [weights illegible]

* High positive score indicates that individual is likely to stay.


In order to determine the practical usefulness of the three-month
turnover equation, we examined the data to see what would have
happened if, at the time of employment, we had rejected the 35
girls out of the total group of 358 who had the lowest scores on
the three-month turnover battery. Under present labor market
conditions we would not want to reject many more than this. We
found that if we had rejected these 35 girls we would have rejected
23 girls who would leave in three months or less and only 12 girls
who would stay longer than three months. This means that we would
have rejected 36% of the total group who would leave in three
months and we would have rejected only 4% of those who would stay
longer than three months. Thus it appears that we can use our
turnover equations to screen out a substantial percentage of girls
who would quit the Company very quickly and would not justify their
training expense, and at the same time only lose a small percentage
of the girls who become useful long time employees.


Table 4

Effectiveness of Three-Month Turnover Battery: Actual Behavior of
the 35 Girls with Lowest Scores as Compared with Remainder of
Sample

              Left in Less      Stayed 3 Months
              than 3 Months     or More            Total

Accepted           42                281            323
Rejected           23                 12             35
Total              65                293            358
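The screening percentages discussed here follow directly from the counts in Table 4. A small sketch of the arithmetic, with dictionary names chosen only for illustration:

```python
# Counts from Table 4
accepted = {"left": 42, "stayed": 281}
rejected = {"left": 23, "stayed": 12}

def pct_rejected(outcome):
    """Per cent of an outcome group that the cutoff would have rejected."""
    total = accepted[outcome] + rejected[outcome]
    return 100 * rejected[outcome] / total

print(f"{pct_rejected('left'):.1f}")    # 35.4
print(f"{pct_rejected('stayed'):.1f}")  # 4.1
```

The counts give roughly 35 to 36 per cent of the early leavers screened out, at a cost of about 4 per cent of those who would have stayed.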


Summary 


As a summary of the findings and implica- 
tions of this study we would like to make the 
following points. 

1. We can predict quick turnover among newly hired girls for
routine clerical jobs moderately well using a combination of
Biographical Data, an Interest Questionnaire, General Ability
Tests, and Clerical Speed Tests. Biographical Data is the best
predictor. The other measures increase only slightly the
effectiveness of prediction as estimated by multiple correlation.

2. We can predict turnover for girls who leave in less than three
months better than for girls who leave in less than twelve months.
As you might infer from this, we cannot predict four to twelve
month turnover nearly as well as one to three month turnover. One
possible explanation of this is that girls who leave very quickly
are more definitely unsuited for their jobs than those who leave
later and therefore their turnover is more predictable. Another
possible explanation is that a very high proportion of the
three-month turnover group go on to college and this kind of
turnover may be especially predictable.

3. The use of General Ability tests with 
negative weights in selecting girls who will be 
good turnover risks does not conflict with our 
aptitude batteries used to predict job per- 
formance on beginning assignments, since the 
valid predictors for most of those jobs are 
tests of clerical ability rather than the Arith- 
metic Reasoning and Vocabulary tests. 

4. Textbooks in industrial psychology (1, p. 248; 2, pp. 313-314;
4, pp. 89, 97) frequently stress the negative relationship between
intelligence and the likelihood of a person staying on a routine
clerical job, and recommend the use of upper critical scores on
intelligence tests for selecting personnel for such jobs. While we
did find the same negative relationship, it is interesting to note
that in this study other factors such as family and educational
background and interests and aspirations tend to be more important
than intelligence.

For our purposes we do not think it necessary or desirable to use
an upper critical score on intelligence. General ability scores are
related to success on most of our higher level jobs, and in order
to have girls with potentiality for advancement it is necessary to
hire a number with high general ability. Fortunately our research
indicates we can hire girls with such ability who will be fairly
safe risks as well as good performers on beginning jobs if we
screen them carefully on biographical and interest measures, and on
clerical aptitude tests.


Received November 14, 1952, 


References 


1. Bingham, W. V. Aptitudes and aptitude testing. New York: Harper,
   1937.

2. Burtt, H. E. Principles of employment psychology. New York:
   Harper, 1942.

3. Jurgensen, C. E. Selected factors which influence job
   preferences. J. appl. Psychol., 1947, 31, 553-564.

4. Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1947.




THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953


Per Cent Increase in Proficiency Resulting from Use of Selective
Devices


Clarence W. Brown and Edwin E. Ghiselli


University of California, Berkeley


One common way of presenting evidence concerning the validity of a
selective device is by means of the validity coefficient. When this
coefficient is high then the device is said to be useful as a means
for evaluating candidates for a job, and when it is low the device
is said to be ineffectual. Taylor and Russell (2) have shown that
this notion is too simple to describe the situation, since the
gains from the use of a selective device may also be a function of
the proportions of persons selected and rejected. When the validity
coefficient is low and the ratio of the number of persons selected
to the number of applicants is low, the gains may be greater than
when the validity is high and the selection ratio also high.

Taylor and Russell's approach has been to evaluate the
effectiveness of the results of selection in terms of the
proportion of selected persons who turn out to be successful on the
job. That is, some cut-off point is set on the criterion and all
individuals who meet or exceed this critical point are deemed
successful, while those falling below are termed unsuccessful. In
many situations this approach is exceedingly useful. Thus, in a
training program where a specified proportion of persons is to be
passed, knowing the validity of a test and the proportion of
persons who will be selected, an estimate can be made of the
proportion of those selected who will pass the program.

In many situations, however, this is not the information desired.
Rather, what is wanted is an estimate of the proficiency of those
selected as measured by some criterion. Thus the question might be
asked, "If a test of known validity is used and a given proportion
of candidates is selected on the basis of their scores, how will
the output


of the selected workers compare with that of 
the unselected workers?"¹ If the average pro- 
duction of selected workers is not much 
greater than that of unselected workers, then 
the test will not be worthwhile even though it 
possesses high validity. Furthermore, having 
an estimate of the potential proficiency of the 
selected workers will make it possible to im- 
prove the planning of production schedules. 
Suppose, for example, it were desired to place 
on a particular job persons whose average pro- 
duction is a given amount. Knowing the va- 
lidity of the test, the proportion to be selected 
to achieve a certain production schedule could 
be determined. 

Jarrett has recently considered this prob- 
lem and has developed a formulation which 
permits the appropriate estimates to be made 
(1). As with the Taylor-Russell approach 
normal linear correlations are assumed. The 
data necessary to estimate gains in proficiency 
from use of a selective device are the validity 
coefficient, the proportion of cases to be se- 
lected, and if per cent gains are to be esti- 
mated, the mean and standard deviation of 
the criterion scores of unselected cases. 

Table 1 is the basic table that has been developed from Jarrett's
formulation. This table gives the mean of the standard criterion
scores of the selected cases in relation to the validity and the
selection ratio. The basic distribution of standard scores is of
the unselected cases, and has a mean of zero and a standard
deviation of unity. For example, suppose the validity of the
selective device is .50 and the 25% highest scoring candidates are
selected, then the mean criterion score of the selected cases would
be .63 standard deviations above the mean criterion score of the
unselected cases. By reversing signs, the mean criterion score of
rejected cases can also be estimated. In the case just given, the
mean criterion score of the rejected 75% of cases would be .21
standard deviations below the mean of unselected cases.

¹ As used in this paper the term "unselected" will have the same
meaning as given in Taylor and Russell's (2) and Jarrett's (1)
discussions. It will refer to "the members of that population of
individuals who apply for the job in question and who, when
individuals are needed for the job in question, would have been put
to work without further regard for their qualifications before the
testing program was initiated."


Table 1

Mean Standard Criterion Score of Selected Cases in Relation to
Validity and the Selection Ratio

[Rows: selection ratio, .05 to .95 in steps of .05. Columns:
validity coefficient, from .00 upward in steps of .05. The body of
the table is not legible in this copy.]
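The worked example can be reproduced with a short sketch. It assumes the usual bivariate normal result that the mean standard criterion score of the top fraction p, selected on a test of validity r, is r times the normal ordinate at the cutoff divided by p. Treating this as equivalent to the formulation behind Table 1 is an assumption, and the function names are illustrative only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def mean_score_selected(validity, selection_ratio):
    """Mean standard criterion score of the top-scoring fraction."""
    cutoff = STD_NORMAL.inv_cdf(1 - selection_ratio)
    return validity * STD_NORMAL.pdf(cutoff) / selection_ratio

def mean_score_rejected(validity, selection_ratio):
    """Mean standard criterion score of the cases not selected."""
    cutoff = STD_NORMAL.inv_cdf(1 - selection_ratio)
    return -validity * STD_NORMAL.pdf(cutoff) / (1 - selection_ratio)

# Validity .50, best 25% selected
print(round(mean_score_selected(0.50, 0.25), 3))   # 0.636
print(round(mean_score_rejected(0.50, 0.25), 3))   # -0.212
```

These values agree with the .63 and .21 quoted in the text to within rounding.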

It is apparent from Table 1 that the smaller the selection ratio is
the greater will be the mean criterion performance of the selected
cases. Reduction in the selection ratio results in an increase in
mean criterion scores, the relationship being positively
accelerated, the greatest increase in rate of gain occurring with
selection rates smaller than about 20% to 30%. Similarly, as
validity increases there is an increase in the mean criterion score
of the selected cases. In this instance, however, it will be noted
that gains are directly proportional to increase in validity.

In many cases the interest will not be in the standard criterion
scores of the selected group but rather in raw criterion scores.
Knowing the mean standard score of the selected group, and the mean
and standard deviation of the unselected group, the desired
transformation, of course, can easily be made. Thus in the case
already given where the mean standard score of the selected cases
was .63, if the mean and standard deviation of the raw criterion
scores of the unselected cases were 50 and 10 respectively, the
mean raw criterion score of the selected cases would be 56.3. The
per cent improvement in proficiency through selection, therefore,
would be 12.6.


The appropriate calculations have been performed for various values
of the [words illegible] and are presented graphically in Figure 1.
An example will illustrate how this chart is read. Suppose we have
a test with a validity of .50 and we are planning to select the 20%
of persons earning highest scores, the ratio of the standard
deviation to the mean of the raw criterion scores of the unselected
group being .2. Locating the value of the per cent selected, [text
illegible] of the chart [text illegible] line representing the σ/M
ratio of .2. Per cent improvement is determined from the placement
of this point in the series of curves; in the present case this
value would be interpolated as approximately 14%. The mean standard
score of the selected group, as also read from the center column of
the chart (standard criterion scores of selected cases), is .7.
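The chart reading in this example can be checked numerically. The sketch below assumes that per cent improvement equals 100 times σ/M times the mean standard criterion score of the selected group, with that mean taken from the bivariate normal result (validity times normal ordinate at the cutoff, divided by the selection ratio). The function names are illustrative and not taken from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def mean_score_selected(validity, selection_ratio):
    """Mean standard criterion score of the selected fraction."""
    cutoff = STD_NORMAL.inv_cdf(1 - selection_ratio)
    return validity * STD_NORMAL.pdf(cutoff) / selection_ratio

def percent_improvement(validity, selection_ratio, sd_over_mean):
    """Per cent gain in mean raw criterion score through selection."""
    return 100 * sd_over_mean * mean_score_selected(validity, selection_ratio)

# Validity .50, best 20% selected, sigma/M = .2
print(round(mean_score_selected(0.50, 0.20), 2))        # 0.7
print(round(percent_improvement(0.50, 0.20, 0.2), 1))   # 14.0
```

Both figures match the values read from the chart in the text.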

The chart, of course, can also be read in the reverse direction.
Suppose the ratio of the standard deviation to the mean of the raw
criterion scores of the unselected cases is .25 and it is desired
to improve criterion performance by 20% through selection of
personnel. Locating the point at the intersection of the vertical
line for a σ/M of .25 and the curve for 20% improvement, following
a line horizontally will give various values of validity and of
selection ratio that will produce the desired result. If as many as
50% of applicants are to be selected, then test validity would have
to be perfect (1.00). With a more reasonable validity of .40, only
the best 5% could be selected.

Fig. 1. Interrelationships among the selection ratio, test
validity, relative variation in criterion performance, and per cent
improvement in proficiency through selection. [Chart scales: Per
Cent Selected; Standard Criterion Scores of Selected Cases; σ/M,
Criterion Scores of Unselected Cases; Ratio of Best (+2.5σ) to
Poorest (-2.5σ) Worker.]
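Reading the chart in reverse amounts to solving for the validity or the selection ratio that yields a target gain. The sketch below does this under the same assumed bivariate normal relation (gain = 100 × σ/M × validity × ordinate / selection ratio). The bisection routine and all function names are illustrative choices, not the authors' procedure.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def selectivity(selection_ratio):
    """Normal ordinate at the cutoff divided by the selection ratio."""
    cutoff = STD_NORMAL.inv_cdf(1 - selection_ratio)
    return STD_NORMAL.pdf(cutoff) / selection_ratio

def required_validity(target_pct, sd_over_mean, selection_ratio):
    """Validity needed to reach the target gain at a given ratio."""
    return target_pct / (100 * sd_over_mean * selectivity(selection_ratio))

def required_selection_ratio(target_pct, sd_over_mean, validity):
    """Largest selection ratio reaching the target gain (bisection)."""
    needed = target_pct / (100 * sd_over_mean * validity)
    lo, hi = 1e-6, 1 - 1e-6
    for _ in range(100):
        mid = (lo + hi) / 2
        if selectivity(mid) > needed:
            lo = mid   # still above target, so more can be selected
        else:
            hi = mid
    return (lo + hi) / 2

# Target: 20% improvement with sigma/M = .25
print(round(required_validity(20, 0.25, 0.50), 2))          # 1.0
print(round(required_selection_ratio(20, 0.25, 0.40), 3))
```

Under these assumptions the validity needed at a 50% selection ratio is essentially 1.00, and a validity of .40 allows only about the best 5 to 6 per cent to be selected, in line with the chart readings given in the text.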

It is apparent from Figure 1 that as the unselected workers become
more homogeneous in their criterion performance, that is, as the value
of σ/M decreases, the smaller will be the gain from the selection
device. For example, with a validity of .50 and a selection ratio of
10%, if σ/M is .3 then improvement will be of the order of 27%.
However, if the σ/M is .05 then the per cent improvement will only be
about 5%. Probably the limiting case for heterogeneity of criterion
performance can be taken as a σ/M of .33, the standard deviation being
one third the magnitude of the mean. Since sometimes heterogeneity of
criterion performance is expressed in terms of the ratio of the output
of the best to that of the poorest worker, an appropriate scale for
such values is given on the chart at the foot of the right half of the
figure. Since values here must be chosen arbitrarily, performance of
the best worker is taken as being + 2.5 standard deviations in the
distribution of criterion scores and the poorest as − 2.5 standard
deviations.
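The conversion between σ/M and the best-to-poorest ratio follows directly from the ± 2.5 standard deviation convention just described. A minimal sketch, with a hypothetical function name, is given below; it simply evaluates (M + 2.5σ) / (M − 2.5σ) after dividing through by M.

```python
def best_to_poorest_ratio(cv: float, k: float = 2.5) -> float:
    """Ratio of the best worker's output to the poorest worker's output.

    `cv` is sigma/M; the best and poorest workers are taken at +k and -k
    standard deviations (k = 2.5 follows the convention in the text).
    """
    if not 0.0 <= cv < 1.0 / k:
        # At cv >= 1/k the poorest worker's output would be zero or negative.
        raise ValueError("cv must satisfy 0 <= cv < 1/k")
    return (1.0 + k * cv) / (1.0 - k * cv)


if __name__ == "__main__":
    for cv in (0.05, 0.20, 0.25, 0.33):
        print(cv, best_to_poorest_ratio(cv))
```

For a σ/M of .25 the function gives a ratio a little above 4, and the ratio rises steeply as σ/M approaches .40, where the poorest worker's output would fall to zero.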


The picture of the value of selection as given by this approach is by
no means too favorable. A validity of .50 is about as high as can be
expected in most instances and seldom can a selection ratio be less
than 10%. A generous value of σ/M would be .25 (ratio of best to
poorest worker being 4 to 1). For these values it will be seen from
Figure 1 that the expected improvement in criterion performance is
only 23%. In most cases validity will be somewhat lower, the selection
ratio higher, and criterion performance more homogeneous. Under
optimal conditions, therefore, improvement in productivity as a result
of a selection program can be considered to approximate 25%.


Received November 10, 1952. 


References 


1. Jarrett, R. F. Per cent increase in output of selected personnel as
an index of test efficiency. J. appl. Psychol., 1948, 32, 135-145.

2. Taylor, H. C. and Russell, J. T. The relationship of validity
coefficients to the practical effectiveness of tests in selection:
discussion and tables. J. appl. Psychol., 1939, 23, 565-578.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953


Efficiency of Tests When Used to Select the Better of Two
Workers ¹


Laurence Kashdan 


Civilian Personnel Research Branch, United States Air Force,


In most personnel selection situations where [...] other sources of
information about applicants are considered as well in deciding
whether to accept or reject people. Most writers of textbooks in
employment psychology recommend the use of tests in just this way—as
supplements to other valid personal data. Used in this way any test,
no matter what its validity, may vary considerably as to the role it
will play in a given company's program, or among different companies.
How much weight should a personnel officer place upon the test scores
of two people when other data about them are also available? Even low
validity coefficients become actuarially significant in the course of
many decisions based upon test scores.² The Taylor-

1 The author is very grateful to Professor James N. Mosel of The George
Washington University, discussions with whom suggested the main
concepts of this paper. It is also a pleasure to thank Dr. John R.
Roulger of this office for his helpful review of the manuscript.

2 [...] Industrial psychology. New York: Prentice-Hall [...]


Table 1

The Probability of Selecting the Better of Two Workers on the Basis of
the Difference in Their Test Scores

Washington, D. C. 


Russell tables ³ specify the efficiency of selection
using tests whose validity coefficients may
vary between 0 and 1, depending upon vari- 
ous existing employment conditions. How- 
ever, these tables assume that all decisions 
will be made over the long term on the basis 
of test scores alone. 

What is needed is a guide which will help a 
personnel officer decide about the risk he may 
be assuming in relying entirely upon the 
achievement in a test by two or more appli- 
cants for a position—or in choosing to ignore 
relative scores in favor of non-test considera- 
tions. It is possible to specify the probability 
that a test has correctly ranked two people in 
terms of a criterion of job performance when 
the difference in their standard scores on the 
test is known. If the assumptions for computing
the product-moment coefficient of correlation
had been properly met in computing
the validity coefficient, and all scores are now

3 Taylor, H. C. and Russell, J. T. The relationship of validity
coefficients to the practical effectiveness of tests in selection:
discussion and tables. J. appl. Psychol., 1939, 23, 565-578.


c Basis of the 


Difference Between
Test Scores in              Validity Coefficient of the Test (r)
Standard Score Units    .1   .2   .3   .4   .5   .6   .7   .8   .9
ry 5 zo 50 50 .50 50 
F E. = = 3 55 S7 Do G 
50 50 = ‘å 56 58 61 64 68 ae 
ye 3 5 or: 62 66 70 76 86 
l 1.00 -52 3o g a S 70 -75 83 93 
| 1.25 -53 56 ‘61 65 .70 75 81 88 97 
1.50 ge 8 4 @ 8 85 92 99 
l 1.75 4 Pe mno a 89 95 99 
200 35 .60 i 73 19 84 92 97 1.00 
2.25 .56 61 pA 5 82 88 94 98 1.00 
2.50 56 63 oi 78 85 1 .96 99 1.00 
275 57 a 1 i 8 o «97 99 1.00 
3:00 x K 7 «2 8 94 98 100 100 




expressed as standard scores, then we can
derive Table 1.⁴ This gives the probability
that for any two subjects selected at random 
from among those taking the test, the one of 
them who has earned the higher score in the 
test will be the better worker—in terms of the 
criterion of validity for that test. 

As an example of how Table 1 would be 
applied in a practical situation consider the 
following data: 


A achieves a standard score of .95 in a 
test; B earns a score of .20; the validity 
coefficient of the test is .7; what is the 
probability that A will prove to be a better 
worker than B? 


4 See: Jenkins, W. L. An index of selective efficiency (S) for
evaluating a selection plan. J. appl. Psychol., 1953, 37, 78, for a
comparable treatment which disregards test score difference.




Since the difference between the scores of 
the two men is .75 standard score units, the 
first column of the table is entered at .75, and 
moving over to the column for r= .7 the 
tabled probability is given as .70. This means 
that on the basis of test score alone there are
about 7 chances in 10 that A will turn out 
to be the better worker. Or, viewed con- 
versely, if the personnel officer should decide 
to disregard their relative achievement on the 
test and select B over A for the job, there 
would be only about 3 chances in 10 that his 
decision will prove to be correct. 
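The article does not print the formula behind Table 1, but the worked example can be reproduced under one set of assumptions. The sketch below assumes a bivariate normal relation between standard test scores and standard criterion scores, with independent errors of prediction for the two applicants; the function name is hypothetical. Under that model the difference in criterion scores is normal with mean r·d and variance 2(1 − r²), where d is the difference in standard test scores.

```python
from math import sqrt
from statistics import NormalDist


def prob_higher_scorer_is_better(score_difference: float, validity: float) -> float:
    """Probability that the higher-scoring applicant is the better worker.

    score_difference: difference between the two standard test scores (>= 0).
    validity: product-moment validity coefficient r, with 0 <= r < 1.
    Assumption: bivariate normality and independent prediction errors.
    """
    if score_difference < 0:
        raise ValueError("score_difference must be non-negative")
    if not 0.0 <= validity < 1.0:
        raise ValueError("validity must satisfy 0 <= r < 1")
    z = validity * score_difference / sqrt(2.0 * (1.0 - validity ** 2))
    return NormalDist().cdf(z)


if __name__ == "__main__":
    # Worked example from the text: A scores .95, B scores .20, r = .7.
    print(prob_higher_scorer_is_better(0.95 - 0.20, 0.7))
```

For the example in the text this gives a value that rounds to .70, matching the tabled probability; a zero difference gives .50 whatever the validity.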

Table 1 is an effective way to illustrate the 
meaning of a validity coefficient to personnel
people in terms of their own operations. 


Received July 6, 1953. 
Early publication.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953


Ratings of Candidates for Promotion by Co-workers and 
Supervisors 


Doris Springer 


Supervisory Selection Board, North American Aviation, Inc.,
Los Angeles, California

The purpose of this study was to compare ratings made by supervisory
personnel and by co-workers on candidates for promotion to leadman
jobs. Specifically, answers to the following questions were sought:

(1) To what extent do supervisory personnel and co-workers agree in
their ratings of workers?

(2) How does the extent of this agreement compare with (a) the extent
to which members of supervision agree with each other, and (b) the
extent to which co-workers agree with each other in the ratings given
workers?

In analyzing the data to answer these questions, answers were
suggested for other questions, such as:

(3) How do judgments on different items in the rating form compare
with each other?

(4) Is there any evidence that supervisors tend to rate candidates
lower or higher than do their co-workers?

(5) How do the totals of the ratings on the individual characteristics
compare with the ratings on the suitability of the candidate for
promotion?

The problem is of practical importance in determining the reliability
of ratings by supervisors and co-workers and in arriving at the
appropriate weights to be given ratings made by them in an over-all
evaluation of candidates for promotion. The provision in many union
contracts which states that promotions to jobs covered by the contract
are to be by seniority only when ability, skill, [...] performance are
equal draws attention to the need for devising techniques for
determining workers' suitability for promotion [...] be acceptable
[...] of these techniques [...] analyze the results of their actual
application. The present study provides data on two of these
techniques, namely, co-worker ratings and supervisory ratings.

From a theoretical standpoint, the study contributes some data on the
attitudes of two distinct groups in the economic structure and on the
relative homogeneity of thought of these two groups with respect to
one aspect of their work environment. An accumulation of such data
will enable us at some future time to arrive at a psychological and
sociological understanding of the two groups which will be invaluable
to the industrial psychologist.


Ratings Studied 


This study is based on the ratings made on 
100 men who were candidates for leadman ¹
jobs in 14 different departments of the manu- 
facturing division of a major aircraft com- 
pany. The ratings were made as a regular 
phase of the company’s supervisory selection 
program in which each candidate is evalu- 
ated on the basis of his work experience, edu- 
cation, work record, and scores on mental 
ability, shop math, and job knowledge tests, 
in addition to the ratings. Ratings are made 
by two supervisors, representing two levels of 
supervision over the candidate, and by three 
co-workers who work closely with the can- 
didate but who are not eligible to be candi- 
dates for the leadman job. The ratings 
analyzed here are the ratings made by two 
members of supervision and two of three co- 
workers (selected at random) for each of the 
100 candidates. A total of 68 different as- 
sistant foremen and foremen made the super- 
visory ratings. The exact number of co- 
workers participating cannot be reported since 
these rating forms were not signed, but the 
number was probably between 150 and 175.


1 At North American Aviation, Inc., a leadman directs a group of five
to ten men. The job is covered by union contract.


A worker ordinarily rated only one worker 
for any one job opening and rarely did a job 
opening occur in the same group during the 
period studied. 

The two rating forms used were the “be- 
havior sample” type in which five gradations 
from very poor to outstanding were described 
for each characteristic. The form used by 
the co-workers consisted of five factors; 
namely, job knowledge, job performance, co- 
operation, ability to train others, and suita- 
bility for promotion to leadman. The form 
used by supervisory personnel consisted of 
eight factors; namely, job knowledge, quality 
of work done, quantity of work done, co- 
operation, drive, observing rules, personal 
appearance and manner, and suitability for 
promotion to leadman. The raters were in- 
structed to check the one statement for each 
factor which best described the candidate. 
The ratings were made independently. For 
purposes of this report, the five intervals have 
been assigned values of 1 through 5, from 
lowest to highest. 


Statistical Method 


The degree of relationship between the 
variables studied has been measured by the 
product moment coefficient of correlation.
It was believed that the nature of the data 
justified the use of this technique because 
the series were more nearly continuous than 
discrete and more nearly quantitative than 
qualitative. 

When a difference is described in the report
as significant, that difference is so large that
it could be expected by chance not more than
once in 100 times (P ≤ 0.01).
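As an illustration of the statistic named above, the following sketch computes a product-moment coefficient for two raters who have each placed the same candidates on the 1-to-5 scale. The ratings in the example are invented for illustration and are not data from the study; the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical ratings of six candidates by one supervisor and one co-worker.
    supervisor = [3, 4, 3, 5, 2, 4]
    co_worker = [4, 4, 3, 5, 3, 5]
    print(pearson_r(supervisor, co_worker))
```

With only five scale steps the ratings are coarse, which is the point the paragraph above addresses in treating the series as "more nearly continuous than discrete."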


Results 


Relationship between ratings made by members of supervision and
co-workers. The coefficients of correlation between the ratings made
by one member of supervision and one co-worker for each candidate on
items common to both rating scales are shown in Table 1. The single
supervisory rating was chosen at random from the two ratings made and
the single co-worker rating was chosen at random from the three
ratings made. Supervisory ratings on quality of job performance and
quantity of work done have been compared with the single rating on job
performance given by co-workers.

All of the correlations are rather low, ranging from .15 to .39;
however, only the lowest coefficient is not significantly greater than
zero. There is greatest agreement on the over-all rating of general
fitness for promotion. The data in Table 1 suggest that co-worker and
supervisory ratings do not duplicate each other unnecessarily; and, at
least in this respect, the consideration of both types of ratings in
evaluating candidates for promotion seems justified.

Table 1

Relationship Between Ratings Made by Members
of Supervision and Co-workers

                                   Coefficient of
Item Rated                         Correlation

Job knowledge                      .15
Job performance—Quality            .25
Cooperation                        .29
Job performance—Quantity           .33
General fitness for promotion      .39

The low degree of agreement between the ratings of supervisory
personnel and co-workers indicates that many factors determining the
ratings of the two groups are either not similar, or are not receiving
the same relative emphasis. Perhaps their standards of judgment, based
on differences in scope and type of experience and present job status,
account for the lack of agreement. Their ratings may be determined by
observations of different samples of behavior of the men being rated.
On the other hand, the discrepancy in the ratings found here may be
accounted for by differences of opinion on what characteristics are
desired in a leader of the work group. Research on worker and
supervisory groups (1) has suggested that [...] differences [...]. The
data reported here merely show that differences between the opinions
of co-workers and members of supervision do exist; further research is
necessary to identify the sources of these differences.


Relationship between ratings made by pairs of co-workers. The
coefficients of correlation between ratings made by pairs of
co-workers on the candidates are shown in Table 2. The coefficients
indicate an agreement between pairs of co-workers which, although
greater than zero, is moderately low to moderate.

Table 2

Relationship Between Ratings Made by
Pairs of Co-workers

                                   Coefficient of
Item Rated                         Correlation

Cooperation                        .34
General fitness for promotion      .34
Instruction ability                .41
Job performance                    .41
Job knowledge                      .43

Total of all items                 .48


With one exception, the correlations between ratings made by pairs of
co-workers are higher than the correlations between co-workers and
supervisory personnel. There is slightly less agreement among
co-workers than between co-workers and supervisors on general fitness
for promotion; however, the difference is not significant.

When ratings given for all items on the rating form are combined, the
coefficient obtained is slightly higher (though not significantly so)
than for any individual item. The coefficients in Table 2 are in line
with those reported in most studies of supervisory merit ratings
(2, 3, 4).
The greater agreement among co-workers than between co-workers and
supervisors may reflect more similarity among the former than between
the latter with respect to standards of judgment, behavior actually
observed, and/or opinions on what characteristics are desired in a
leader of the work group. The fact that only moderate agreement is
found indicates that the co-workers are far from being a homogeneous
group with respect to attitudes toward their co-workers.

The comparison here may be interpreted as a measure of the reliability
of the co-worker ratings. The moderately low to moderate reliability
of the ratings indicates that such ratings should not be used as the
sole basis for selection and that care must be taken in their
interpretation. The relatively low reliability of co-worker ratings,
as compared with reliability coefficients of other types of measures,
should be considered in deciding on the weight of these ratings in the
battery of measurements to be used in evaluating the candidates.

Relationship between ratings made by pairs 
of supervisory personnel. The coefficients of 
correlation between the ratings made on each 
candidate by two members of supervision are 
shown in Table 3. 

The coefficients, ranging from .56 to .71, 
indicate a fairly high degree of agreement be- 
tween the members of supervision in rating 
workers on all items included in the rating 
scale. The over-all rating, general fitness for 
promotion, showed the highest degree of 
agreement although none of the differences 
between the items are clearly significant. 
The fairly high correlations indicate that 
members of supervision tend to base their rat- 
ings on similar observations of the workers’ 
performance and to judge the various char- 
acteristics according to similar standards. 

All of the coefficients reported in Table 3 
exceed those reported in the previous com- 
parisons and suggest a greater degree of 
agreement among members of supervision 
than among co-workers and between co-work- 
ers and members of supervision. 

If the relationship is interpreted as a measure of reliability, then
the supervisory ratings have a fairly high degree of reliability. The
greater consistency of the supervisory ratings as compared with
co-worker ratings suggests that the former are more dependable.

Table 3

Relationship Between Ratings Made by
Pairs of Supervisors

                                   Coefficient of
Item Rated                         Correlation

Observing rules                    .56
Personal appearance                .61
Quality of work                    .61
Job knowledge                      .63
Drive                              .65
Quantity of work                   .66
Cooperation                        .67
General fitness for promotion      .71

Total of all items                 .66

Table 4

Distributions of Ratings by Supervisors and Co-workers

                                 Number of Ratings in Interval   Mean of   Standard
Item Rated and Rater               1     2     3     4     5     Ratings   Deviation

Job knowledge
  Supervisors                      1     8   [three entries      3.73       .83
                                              illegible]
  Co-workers                       1     8    55    63    73     4.00       .92
Job performance—Quantity*
  Supervisors                      0     5    84    68    43     3.74       .82
  Co-workers                       1     6    44    76    73     4.07       .86
Job performance—Quality
  Supervisors                      0     1    56    86    57     4.00       .76
  Co-workers                       1     6    44    76    73     4.07       .86
Cooperation
  Supervisors                      0     8    90    44    58     3.76       .92
  Co-workers                       1    10    41    67    81     4.08       .94
General fitness for promotion
  Supervisors                      3    35    61    60    41     3.50      1.05
  Co-workers                       3    16    50    65    66     3.87      1.01

* Supervisory ratings on quality of work done and quantity of work done
are compared with co-worker ratings on job performance which included
both quality and quantity.

Comparison of the distributions of ratings
by members of supervision and co-workers.
The distributions of the ratings by the 200 
members of supervision and the 200 co-work- 
ers on the items common to both rating forms 
are shown in Table 4. 

The ratings of supervisors tend to be more conservative than those of
the co-workers. This is evident in a comparison of the proportions of
ratings of the two groups which are in the highest interval in the
rating scale (step 5). For every characteristic rated a smaller
proportion of the supervisory ratings is in the top interval than is
true of co-worker ratings. In only one instance is the difference
small enough to be attributed to chance (for job performance-quality,
P = .10).

The tendency of supervisors to give lower ratings than co-workers is
shown also in a comparison of the means of the various items rated. In
every instance the mean of the supervisory ratings is lower than the
mean of the co-worker ratings. The differences are significant at the
1% level, or better, with the exception of job performance-quality,
where P = .19.
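The means and standard deviations in Table 4 can be recomputed from the interval frequencies once the five intervals are scored 1 through 5. The sketch below does this for the co-worker ratings on job knowledge (1, 8, 55, 63 and 73 ratings in intervals 1 to 5). It assumes the standard deviation was computed with N rather than N − 1 in the denominator, which is an assumption and not something the article states; the function name is hypothetical.

```python
from math import sqrt


def mean_and_sd_from_counts(counts):
    """Mean and standard deviation of ratings given interval frequencies.

    counts[i] is the number of ratings falling in scale value i + 1.
    Assumption: the standard deviation uses N (not N - 1) in the denominator.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("no ratings supplied")
    values = range(1, len(counts) + 1)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    variance = sum(c * (v - mean) ** 2 for v, c in zip(values, counts)) / n
    return mean, sqrt(variance)


if __name__ == "__main__":
    # Co-worker ratings on job knowledge, intervals 1 through 5 (Table 4).
    mean, sd = mean_and_sd_from_counts([1, 8, 55, 63, 73])
    print(mean, sd)
```

The result is a mean of about 4.00 and a standard deviation of about .92, in agreement with the tabled values for that row.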


Very few of the workers were rated in the lowest category by either
supervisors or co-workers. Since there had been some prior selection
of the men (they had been proposed for consideration by either members
of supervision or of Industrial Relations), it was expected that
seldom would a candidate be rated as very unsatisfactory in any
factor. Although the frequencies in the second interval are higher
than in the lowest interval, the second interval is used in fewer than
5% of the ratings except for the over-all rating. For the item,
general fitness for promotion, approximately 18% of the ratings of
supervisory personnel and about 8% of the ratings of co-workers are in
the next to the lowest interval.


The interval with the highest total frequency is the third [...] for
members of supervision [...] co-workers, step [...] three of the four
different factors [...] for the relatively low ratings given by
members of supervision as compared with co-workers. Perhaps the status
of supervisory personnel results in more realistic, less personal
ratings. Also, members of supervision have had more training in the
use of the rating form since many of them attend meetings of the
Supervisory Selection Board. Some of them had reviewed the rating
forms when the forms were being constructed.

Comparison of the individual items on the rating forms. In a
comparison of ratings assigned to the various items shown in Table 4,
it appears that the distribution for the final over-all ratings on
suitability for promotion differs from the distributions on the other
factors of ratings by both supervisors and co-workers. For example,
the mean of the ratings for this factor is significantly lower than
the mean of the ratings assigned any of the other items. A greater
proportion of the ratings on this factor are in the two lowest
intervals (below average) than is true of any other factor; however,
only in the case of the supervisory ratings are the differences
clearly significant.

The differences between the standard deviations for the ratings given
by the two groups of raters are not statistically significant. The
greatest variation in ratings of both groups is found in the ratings
on general fitness for promotion.

When the final item, suitability for promotion to leadman, is compared
with the total of the ratings on all other items in the rating forms,
the correlations obtained are .85 for supervisors and .85 for
co-workers. The coefficients approximate the ones reported in previous
studies in which the same type of comparison was made (3, 4).


Summary and Conclusions 


A group of 100 men who were candidates for promotion to leadman jobs
in the manufacturing division of an aircraft company were rated by
members of supervision and by co-workers. Comparisons were made
between ratings given each candidate by: (1) a member of supervision
and a co-worker; (2) two members of supervision; and (3) two
co-workers. The following conclusions are based on the results of
these comparisons:


1. There is a low, positive degree of rela- 
tionship between the ratings given by super- 
visory personnel and co-workers. 

2. There is a slightly higher degree of agreement between the ratings
of pairs of co-workers than between the ratings of members of
supervision and co-workers. The correlations obtained indicate a
moderately low to moderate statistical reliability for the co-worker
ratings.

3. There is a much higher degree of agree- 
ment among the ratings given by members of 
supervision than among ratings given by co- 
workers. The correlations obtained indicate 
a fairly high statistical reliability for the su- 
pervisory ratings. 

4. Supervisory personnel tend to rate the
men lower than do co-workers on all items 
common to the two rating forms as shown 
by consistently lower mean ratings, by lower 
modal intervals, and by a larger proportion 
of candidates considered below average on 
general fitness for promotion. 

5. Both members of supervision and co- 
workers tend to be somewhat more conserva- 
tive when rating the candidates on the over- 
all item, general fitness for promotion to lead- 
man, than when rating individual charac- 
teristics. 

6. There is a very high degree of relation- 
ship between the total of ratings on all 
separate characteristics and the ratings given 
on the single item, general fitness for pro- 
motion. 


Received July 20, 1953. 
Early publication. 


References 


1. Fleishman, E. A. The measurement of leadership attitudes in
industry. J. appl. Psychol., 1953, 37, 153-158.

2. Ghiselli, E. E. The use of the Strong Vocational Interest Blank and
the Pressey Senior Classification Test in the selection of casualty
insurance agents. J. appl. Psychol., 1942, 26, 793-799.

3. Stead, W. H., Shartle, C. L., et al. Occupational counseling
techniques. New York: American Book, 1940, pp. 49-72.

4. Tiffin, J. Industrial psychology. New York: Prentice-Hall, Inc.,
1952, pp. 345-346.

5. Williams, S. B. and Leavitt, H. J. Group opinion as a prediction of
military leadership. J. consult. Psychol., 1947, 11, 283-291.

The Journal of Applied Psychology
Vol. 37, No. 5, 1953


Turnover Factors as Assessed by the Exit Interview 


Frank J. Smith and Willard A. Kerr 
Illinois Institute of Technology 


Many employees in the process of quitting 
their jobs are in a mood to express feeling and 
speak frankly. If the enterprise maintains a 
formal exit interview in which the employee is 
assured that nothing he says will be used 
“against him” in any way, the tendency to- 
ward frankness and even catharsis is strength- 
ened. The exit interviewer thus is in a 
uniquely advantageous position to observe the 
dynamics of the turnover process. Different 
approaches (1, 5, 8, 10, 11) to study of turn- 
over are desirable; undoubtedly avoidable 
turnover differs from one enterprise to an- 
other in qualitative ways because of differing 
organizational climates and the patterns or 
syndromes of reasons for quitting should in 
part be products of these climates.


Experimental Design 


On the assailable but necessary assumption that exit interviewers are
adequate media for assessing the patterns of turnover, the following
research was executed. A brief content analysis report ¹ was
constructed, requesting the exit interviewer to estimate how often in
five typical interviews each of sixteen topics was “mentioned as a
reason for leaving.”

This form with a cover letter was sent to the exit interviewer in each
of 200 different nationally representative companies (selected
randomly from Poor’s). Of these, nineteen replied that they did no
exit interviewing and two were returned unclaimed. Another two were
returned with verbal explanations but without usable quantitative
data. Forty-eight properly completed analyses were returned and
utilized in this research. The 48 companies are geographically
representative and report an annual exit interview case load of 5075.

1 [...] Institute. Order document No. 405 [...] Auxiliary Publications
[...] Service, Library of Congress, Washington [...], remitting $1.25
for microfilm ([...] 1 inch high on standard 35 mm. motion picture
film) or [...] optical aid.

Instrument Reliability. A split-half reliability coefficient for the
content analysis report on the 48 returns was .81 which became .90
when corrected by the Spearman-Brown formula.
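The correction mentioned here is the standard Spearman-Brown step-up for a test of doubled length, r' = 2r / (1 + r). A minimal sketch follows; the function name and the general length factor are illustrative additions, not part of the article.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Reliability of a test lengthened by `length_factor`.

    With length_factor = 2 this is the usual correction of a split-half
    coefficient: r' = 2r / (1 + r).
    """
    denominator = 1.0 + (length_factor - 1.0) * r
    if denominator == 0:
        raise ValueError("correction is undefined for this r and length_factor")
    return length_factor * r / denominator


if __name__ == "__main__":
    print(spearman_brown(0.81))  # exit interview form, split-half .81
    print(spearman_brown(0.85))  # counseling form, split-half .85
```

Applied to the split-half value of .81 the formula gives about .895, which rounds to the .90 reported; the .85 reported later for the counseling form steps up to about .92.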


Results 


Exit Interview Content Profile. According to reports from these 48
companies, each topic is mentioned as a reason for leaving as follows
(per five representative interviews): pay, 1.89; transportation, 0.81;
promotion, [illegible]; working conditions, 0.69; poor health, 0.64;
job security, 0.54; friction with co-workers, 0.52; poor housing or
excessive rents, 0.59; personal happiness as affected by job
experience, 0.33; ability of supervisor, 0.33; broken promises by
supervisor, 0.25; confidence in management, 0.19; company interest in
employee welfare, 0.15; freedom of communication with higher levels,
0.12; recreation, [illegible]; method of wage payment, 0.02; other
problems, 1.15.

Comparison of Content Profile with Job Satisfaction Data. Most of the
above topics are included in a widely used job satisfaction survey
form (7). When some typical survey results (9) were compared with
these exit interview content data, it was found that pay was the
foremost grievance in both. Working conditions also was a major
grievance in both. Among other topics, however, the agreement was
moderate or low.

The results just quoted which emphasize employee concern about pay may
seem to contradict some previous research. It is true that many
researchers (2, 3, 4, 6, 12, and others) have found that when
employees are asked what they consider “most important” in their jobs,
they do not put pay as foremost in importance. Actually, such results
are not in contradiction with the survey and exit interview results,
because the “importance ranking” studies represent research on an
entirely different variable. “Factor importance in a job” is not the
same thing as what the employee is happy or unhappy about in a job.
The factor importance ranking studies referred to above are cast in an
abstract, theoretical frame of reference for the employee respondent.
They get at his set of philosophical values. But the job satisfaction
survey and exit interview get at something different: not the
importance but the satisfactory or unsatisfactory condition of each
factor in current work experience. Generalizing from all these related
researches, we might suggest that employees in general concede that
pay is not the most important factor in a job, but they nevertheless
feel that it represents a foremost grievance factor. This
generalization is bolstered by a non-attitudinal turnover study,
revealing pay as a foremost objective correlate of turnover (8).

Comparison of Exit Content Profile with Routine Personnel Counseling
Profile. A content analysis report form substantially identical to the
one used for exit data except that it was focused upon routine
personnel counseling was constructed and sent to 39 companies which at
some time had had counseling programs. The eight firms which finally
cooperated returned reports from 22 personnel counselors who reported
serving an annual total load of approximately 30,000 cases. A sixteen
topic profile was computed on these 22 reports. The split-half
reliability was .85 which corrected to .92. When these profile values
were correlated with the similarly-derived exit interview profile
values rho was found to be .74. Apparently the content of routine
personnel counseling interviews is very similar to the content of the
exit interview.

Fig. 1. The exit interview content patterns. [Cluster diagram linking
the topics: transportation problem; inadequate pay; poor housing or
excessive rents; promotion problem; freedom of communication with
higher levels; ability of supervisor; confidence in management;
company's interest in employee welfare; personal happiness as affected
by job experience; broken promises by supervisors; friction with
coworkers; job security; working conditions; poor health.]

Pattern Structure of Exit Interview Con- 
tent. Intercorrelations were computed among 
the sixteen exit interview content topic fre- 
quencies for the 48 companies. A simple 
linkage-type cluster analysis of the resulting 
matrix was then performed in an effort to 
isolate the most characteristic exit patterns or 
climates. Results of this analysis are shown 
schematically in Figure 1. 
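The article does not spell out the linkage procedure, so the following is only a sketch of one elementary linkage rule: each topic is joined to the topic with which it correlates most highly, and topics connected through such links form a cluster. The correlation values in the example are invented for illustration and are not taken from the study; the function name is hypothetical.

```python
def elementary_linkage(names, corr):
    """Group variables by linking each one to its highest correlate.

    names: list of variable labels.
    corr: square symmetric matrix (list of lists) of intercorrelations.
    Returns a sorted list of clusters, each a list of labels.
    """
    n = len(names)
    parent = list(range(n))  # union-find forest over variable indices

    def find(i):
        # Follow parent links to the cluster representative, compressing paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Index of the variable that correlates most highly with variable i.
        j = max((k for k in range(n) if k != i), key=lambda k: corr[i][k])
        parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(names[i])
    return sorted(clusters.values())


if __name__ == "__main__":
    # Hypothetical intercorrelations among four exit interview topics.
    topics = ["pay", "promotion", "job security", "working conditions"]
    matrix = [
        [1.0, 0.8, 0.1, 0.2],
        [0.8, 1.0, 0.2, 0.1],
        [0.1, 0.2, 1.0, 0.7],
        [0.2, 0.1, 0.7, 1.0],
    ]
    print(elementary_linkage(topics, matrix))
```

A rule of this kind places every topic in exactly one cluster, whereas the patterns described in the text allow a topic such as "ability of supervisor" to appear in several clusters, so the authors' procedure evidently differed in that respect.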

Perhaps the most conspicuous single out- 
come of the analysis is the presence in four 
of the five clusters of “ability of the super- 
visor.” Even “poor health,” which correlates 
with nothing else as a reason for quitting, 
correlates .75 with (lack of) “ability of supervisor.”
The question of whether the supervisor
is a convenient scapegoat for the employee
in poor health or whether he bears a
causal psychosomatic relationship to employee 
poor health is not answered in these data. 
The triad in which the supervisor also figures 
along with “transportation” grievances and 
lack of “confidence in management” suggests 
an interesting pattern in some companies. It 
appears probable that employees living far 
away from the plant have more transportation 
difficulties and therefore are tardy or absent 
more frequently than other personnel. The 
supervisor (of this pattern) categorizes mentally
and orients to these employees as being
of the “less dependable, tardy, absentee type.”
Gradually the distant-living employee perceives
this apparent untrusting attitude in
the supervisor, and he develops a reciprocal 
lack of confidence in the management. Even- 
tually, according to this plausible interpreta- 
tion, he ends up in the exit interview com- 
plaining about transportation, the ability of 
the supervisor, and lack of confidence in man- 
agement. 

Three other factors, climates, or syndromes
appear. A general human relations pattern


Frank J. Smith and Willard A. Kerr 


emphasizes concern with broken promises by 
supervision, friction with co-workers, com- 
pany interest in employee welfare, freedom 
of communication, promotion, ability of su- 
pervisor, confidence in management, and job 
effect on personal happiness. A security pat- 
tern emphasizes complaints about job security, 
working conditions, confidence in management,
ability of supervisor, interest in employee
welfare, broken promises by supervisor,
job effect on personal happiness, and
friction with co-workers.

An upgrade pattern is evident in a tendency
toward simultaneous complaint about
promotion, pay, and freedom of communication.
Poor housing complaint is also included,
correlating negatively with pay complaint
but positively with grievances about
promotion and freedom of communication.
The “rush to get ahead” syndrome is apparent
here. Promotion and access to “higher ups”
are considered important, along with pay or
satisfying family housing. They do not feel
that the present employment permits them to
meet their pay and prestige goals fast enough.

In interpretation it should be noted that
these patterns may, in fact, represent the
turnover-inducing climates operating in parts of
the 48 enterprises reported on. Unfortunately,
to an unknown degree, it is possible
that the patterns may be influenced by frames
of perceptual reference of the interviewers
themselves. Even allowing for some common
sets among interviewers, it still seems reasonable
that their reports must also have been
influenced by what they have seen, heard, and
sensed in their daily work of exit interviewing.


Summary 

Forty-eight exit interviewers in as many
companies supplied topical analyses of exit
interview content. These data, ostensibly
products of differing turnover climates, were
summarized by topic, intercorrelated, and
analyzed to suggest the following conclusions:

1. Pay grievances were mentioned twice as
frequently as any other single topic of complaint.
Next in order of complaint were
transportation, promotion, working conditions,
poor health, housing, the job [...],
job security, co-workers, [...] supervisor,
confidence in management, interest in employee
welfare, freedom of communication with higher
levels, recreation, and method of wage payment.

2. The relatively heavy emphasis upon pay
and working conditions agrees with the heavy
emphasis assigned by regular employees themselves
in job satisfaction surveys, and with
turnover correlates, but disagrees with “factor
importance ranking” studies. Otherwise, exit
interview topic emphasis agrees only moderately
with “per cent dissatisfied” on job satisfaction
surveys of non-quitting personnel.

3. When 22 regular (not exit) personnel
counselors submitted reports of content of
their routine counseling, their mean profile of
topic frequencies was found to correlate (rho)
.74 with the mean profile obtained on the 48
exit interviewers. Apparently there is much
in common among the frustrations expressed
by employees who are quitting and by employees
still on the job.

4. A cluster analysis of exit topic frequency
intercorrelations was performed with the following
climatic patterns resulting: a human
relations syndrome; a security syndrome; an
upgrade syndrome; a transportation-confidence
triad; and an unnamed duad.


Received November 24, 1952.


References

1. Baruch, Dorothy. Why they terminate. J. consult. Psychol., 1944, 8, 35-46.
2. Blum, M. L. and Russ, J. J. A study of employee attitude toward various incentives. Personnel, 1942, 19, 438-444.
3. Chant, S. M. Measuring the factors that make a job interesting. Personnel J., 1932, 11, 1-4.
4. Hersey, R. B. Psychology of workers. Personnel J., 1936, 14, 291-296.
5. Ho, C. J. Health and labor turnover in a department store. Personnel J., 1930, 9, 216-221.
6. Jurgensen, C. E. Selected factors which influence job preferences. J. appl. Psychol., 1947, 31, 553-564.
7. Kerr, W. A. The tear ballot for industry. Chicago, 90: Psychometric Affiliates, 1944.
8. Kerr, W. A. Labor turnover and its correlates. J. appl. Psychol., 1947, 31, 366-371.
9. Kerr, W. A. and Cramer, R. J. Age group and attrition morale phenomena in industrially employed males. Paper presented to Midwestern Psychological Association, Detroit, Mich., May 5, 1950.
10. Miller, L. R. Why employees leave. Personnel J., 1944, 23, 111-119.
11. Tiffin, J., Parker, B. T., and Habersat, R. W. The analyses of personnel data in relation to turnover on a factory job. J. appl. Psychol., 1947, 31, 615-616.
12. Wyatt, S., Langdon, J. N., and Stock, F. G. Fatigue and boredom in repetitive work. British Industrial Health Research Board Report No. 77, 1937, 43-46.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953


The Quartile Difference Method of Item Selection ¹


Norman Friedman ²


Occupational Research Center, Purdue University 


It is customary in cases where personal 
data are used as predictors of various criteria 
of job performance to use as predictors all of 
the items which show a significant relation- 
ship with the criterion. The drawback to this 
mode of operation lies in the possibility of 
some of the included items contributing more 
error in prediction than they do to actual va- 
lidity. 

This phenomenon can best be explained in 
terms of item and criterion variance. A hy- 
pothetical illustration will be given in terms 
of two items. Item 1 shares 20 per cent of 
its variance in common with the criterion, in- 
cludes 50 per cent specific variance and 30 
per cent error variance. The second item 
shares 15 per cent of its variance with the 
criterion, but 12 of these 15 per cent are in 
common with the 20 per cent shared by item 
1. Further, 50 per cent of the variance for
item 2 is specific, and 35 per cent is error 
variance. By adding item 2 to item 1, three 
per cent more of the criterion variance is ac- 
counted for; but at the same time 35 per cent 
additional error variance is introduced. Thus,
adding item 2 would result in shrinking the
validity of item 1.

In a specific situation, then, the problem is 
one of selecting a number of items from a 
pool of items so that the selected items give 
a maximum relationship with the criterion.
The Wherry-Doolittle Technique ³ achieves
this goal for data where item validities and 


¹ [...] title “Personal Data as [...] Behavior of Telephone Op[...]

² The author wishes to express his gratitude to the
General Telephone Company of Michigan [...] Norris,
President, whose cooperation [...]. A special word of
thanks is due Dr. Melvin Tieszen, formerly Personnel [...]
of the Company and now affiliated [...] and Hamilton,
New York.


³ Stead, W. H., Shartle, C. L., et al. Occupational
counseling techniques. New York: American Book
Company, 1940, pp. 253-255.




inter-relationships can be expressed in a set
of coefficients of correlation. For categorical
data, however, where item versus criterion
relationships are expressed in 2 X 2, 2 X 3, or
2 X k contingency tables, neither item validities
nor item inter-relationships can be expressed
in correlational terms (except for items
recorded in 2 X 2 tables). Regardless of the
data format, the problem of shrinkage remains
the same. The quartile difference
method proposed here does essentially for
categorical data what the Wherry-Doolittle
Technique achieves with correlational data.

The mechanics of this method will be illustrated
with four application blank items that
were found to be related to the tenure of telephone
operators at the 10 per cent significance
level or better on the basis of analysis with a
primary group of 171 operators. Table 1
lists the four items that were considered to be
significantly related to tenure, the per cent of
high and low criterion cases in each category,
and the scoring weights for the various categories.
In addition, the value of chi square
for each item along with its degrees of freedom
and probability level are listed.
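
The probability levels that accompany the chi square values in Table 1 can be checked with the closed forms of the chi square upper tail for one and two degrees of freedom. The sketch below is illustrative only: the function name is hypothetical, and only the two cases needed for Table 1 are handled.

```python
import math


def chi_square_p(chi_sq, df):
    """Upper-tail probability of chi square for df = 1 or df = 2.

    df = 1: P = erfc(sqrt(x / 2));  df = 2: P = exp(-x / 2).
    Other degrees of freedom are outside this sketch.
    """
    if df == 1:
        return math.erfc(math.sqrt(chi_sq / 2.0))
    if df == 2:
        return math.exp(-chi_sq / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# Chi square and degrees of freedom as listed in Table 1.
table_1 = {
    "Height-weight ratio": (4.931, 2),
    "Marital status": (3.841, 1),
    "When consult physician": (4.822, 2),
    "Education": (7.786, 2),
}
p_values = {item: chi_square_p(x, df) for item, (x, df) in table_1.items()}
for item, p in p_values.items():
    print(f"{item}: p = {p:.3f}")
```

All four items fall inside the 10 per cent level used as the retention rule, and Education shows the smallest probability, which is why it enters the battery first.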


Method 


Table 1

The Four Items Related to Tenure, Per Cent of High and Low Tenure Cases in Each Response Category, Scoring
Weights, Item Chi Square with its Degrees of Freedom and Probability Level

                              % High   % Low    Scoring    Chi
Item Categories               Tenure   Tenure   Weights*   Square   D.F.   p
1. Height-weight ratio                                     4.931    2      .09
     2.00-2.04                  14        8       28
     1.70-1.99                  47       65        4
     1.45-1.69                  39       27       34
                               100      100
2. Marital status                                          3.841    1      .05
     single                     79       65       36
     married                    21       35        8
                               100      100
3. When consult physician                                  4.822    2      .09
     no mention                 31       28       28
     0-9 months                 41       57        6
     9 months +                 28       15       35
                               100      100
4. Education                                               7.786    2      .02
     below high                 21       12       31
     high grad.                 69       63       28
     above high grad.           10       25        7
                               100      100

* The scoring weights were arrived at by subtracting the per cent of low tenure cases from the per cent of
high tenure cases for each category and adding a constant, +22, to eliminate negative weights.


The quartile difference method involves the
following steps:

1. Divide the total sample of cases into a primary
group and a holdout group.⁴ Working with
the primary group, compute chi square for each
item. Then, compute scoring weights for the
various response categories of the items that are
significantly related to the criterion. For the
illustrative case, these results are presented in
Table 1.

2. List the responses of subjects in the holdout
group to the items that were found to
be significantly related to the criterion and assign to
these responses the scoring weights as determined in Step 1.
For example, if a subject in
the holdout group fell in the 1.70-1.99 height-weight
ratio category, was single, had consulted
a physician more than nine months prior to the
date of application and had a high school education,
the scoring weights (taken from Table 1)
for this subject would be listed as follows: 4, 36,
35, 28.

3. Select as the first item to be included in the
battery that item which demonstrated the highest
relationship with the criterion as determined in
Step 1. In this case the item selected was number
4 (Education) with a probability level of .02.

4. List the scoring weights for the first selected
item in order of magnitude (from high to low)
and tally the frequencies of high criterion and
low criterion cases in the holdout group at each
scoring weight. Split the total distribution of
cases at the various scoring weights into an upper
quarter, middle half and lower quarter.⁵ Then,
compute the per cent of high criterion cases in
the upper quarter (Q₃), middle half (Q₂), and lower
quarter (Q₁). The difference in per cent of high
criterion cases between the upper and lower quartiles,
Q₃ - Q₁, serves as a measure of item, or
item combination, discrimination. These computations
for item 4 are presented as the zero order
of analysis in Table 2.

5. Plot the Q₃, Q₂, and Q₁ values obtained from
the zero order analysis on a shrinkage chart, Figure 1.
The horizontal line at the 60 per cent
point represents the per cent of high criterion
cases in the holdout group.

6. Combine the scoring weights for the first
selected item with the scoring weights for each of
the remaining items for every subject in the holdout
group. This procedure will result in as many
new distributions of scoring weights as there are
items to pair with the first selected item. List
the combined scoring weights for each pair of
items in order of magnitude (from high to low)
and tally the frequencies of high criterion and
low criterion cases in the holdout group at each
scoring weight for each distribution of scoring
weights. Once again split the total distribution
of cases at the various scoring weights for each
item combination into upper quarter, middle
half and lower quarter and compute the per cent
of high criterion cases for each of these categories.
For this first order analysis the Q₃, Q₂,
and Q₁ values as computed for each pair of items
are entered in the recording sheet, Table 2. The
second item to be selected for the battery is that
item which, when combined with the first selected
item, yields the highest Q₃ - Q₁ value. In
this case, the second item to be selected was item
2 (Marital Status) which, with item 4, yielded
the highest Q₃ - Q₁ value, namely 52.


⁴ The step outlined here follows a cross validation [...]
with a holdout group of 176 cases. [...] the item
selection technique may be carried out with
a single [...] of employees, it is strongly recommended
[...] that the cross validation procedure be [...].

⁵ If the first selected item is dichotomous, it would
be, of course, impossible to split the total distribution
of holdout cases at the various scoring weights
into high quarter, middle half and low quarter since
the cases in the holdout group are tallied at only
two scoring weights. In situations such as this, the
scoring weights for the first selected item should be
immediately combined with the scoring weights for
the remaining items as is indicated in Step 6 below.
In other words, the zero order of analysis is by-passed
and the researcher goes immediately to the
first order of analysis.

7. Plot the Q₃, Q₂, and Q₁ values for the best
two items as selected by the first order analysis
on the shrinkage chart, Figure 1. In this case,
the Q₃, Q₂, and Q₁ values plotted for the first
order analysis were the values obtained for items
4 and 2 from Table 2. For the illustrative case,
examination of Figure 1, the shrinkage chart, at
this point reveals that the addition of item 2 increases
the validity of the composite as compared
with the validity of item 4 alone (the distance
between the Q₃ and Q₁ values continues to spread,
indicating increased efficiency in prediction or
item combination validity). Consequently, the
item selection procedure is continued.

8. Combine the composite scoring weights for
the first two selected items with the scoring
weights for each remaining item. Once again follow
the computational procedure outlined in steps
4 and 6 above. For this second order analysis
the Q₃, Q₂, and Q₁ values are computed for each
item triad and recorded in Table 2. The third
item to be selected for the battery is that item
which, when combined with the first two selected
items, yields the highest Q₃ - Q₁ value. In this
case the third item to be selected was item 1
(Height-Weight Ratio) which, when combined
with items 4 and 2, yielded the highest Q₃ - Q₁
value, namely 47.


Table 2

Recording Sheet for Quartile Difference Method of
Item Selection Computations

                        % High Criterion
Order of                                             Item
Analysis   Item(s)     Q₃     Q₂     Q₁   Q₃ - Q₁*   Selected
0          4           76     62     41      35      4
1          4,1         73     52     58      15
           4,2         88     66     36      52      2
           4,3         61     58     37      24
2          4,2,1       76     63     29      47      1
           4,2,3       66     63     32      34
3          4,2,1,3     6[?]   58     44      [?]     3

* The highest Q₃ - Q₁ value for each order of analysis is indicated by italics.


[Fig. 1. Shrinkage chart for the Quartile Difference
Method of item selection. Vertical axis: per cent high
criterion cases; horizontal axis: order of analysis.]

9. Plot the Q₃, Q₂, and Q₁ values for the best
three items as selected by the second order analysis
on the shrinkage chart, Figure 1. In this case,
the Q₃, Q₂, and Q₁ values plotted for the second
order analysis were the values obtained for items
4, 2 and 1. Examination of the shrinkage chart
at this point reveals that the addition of item 1
has attenuated the per cent of high criterion cases
in Q₃, but has continued to decrease the per cent
of high criterion cases in Q₁. Since the
index of item discrimination, the Q₃ - Q₁ value,
shows a drop of five per cent from the first to
second order analysis, the researcher might reasonably
stop selecting items at this point in the
selection procedure.

The analysis in this case was continued to include
all four items. Computations for this third
order analysis are recorded in Table 2 and plotted
in Figure 1. Examination of Figure 1 reveals
that the inclusion of item 3 results in further
shrinkage (the distance between the Q₃ and Q₁
values decreases). In fact, the predictive efficiency
of all four items appears to be less than
that of the best single item.

The inclusion of the shrinkage chart in the
procedure is a refinement but by no means a
necessity. The researcher could perhaps as effectively
determine when to stop adding items by
examining the trend in Q₃ - Q₁ values for each
successive composite of selected items as is indicated
on the recording sheet, Table 2. Trends
for all of the quartile values, however,
become more apparent with the shrinkage
chart, and for this reason, it is probably well
worth the additional labor needed for its
construction.
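
The steps of the method can be sketched in a few lines. This is an illustration under stated assumptions, not the author's own computing routine: the function names, the toy holdout data, and the rule that tied scores keep their input order when the quarters are cut are all hypothetical (the text itself notes that quarter splits must be approximated). Only the weight formula and the Table 1 percentages come from the source.

```python
def scoring_weights(pct_high, pct_low, constant=22):
    """Table 1 footnote: % high minus % low, plus a constant (+22)."""
    return [h - l + constant for h, l in zip(pct_high, pct_low)]


def quartile_percents(scores, is_high):
    """Per cent of high criterion cases in the upper quarter, middle half
    and lower quarter of cases ranked by score (needs at least 4 cases)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    quarter = len(order) // 4
    groups = (order[:quarter],
              order[quarter:len(order) - quarter],
              order[len(order) - quarter:])
    return tuple(100.0 * sum(is_high[i] for i in g) / len(g) for g in groups)


def quartile_difference(scores, is_high):
    """Upper-quarter per cent minus lower-quarter per cent (Q3 - Q1)."""
    upper, _, lower = quartile_percents(scores, is_high)
    return upper - lower


def select_items(item_weights, is_high, first_item):
    """Start from first_item, then repeatedly add the item whose composite
    gives the largest Q3 - Q1. Returns one (items, Q3 - Q1) pair per order
    of analysis, i.e. the rows of a recording sheet."""
    chosen = [first_item]
    composite = list(item_weights[first_item])
    history = [(tuple(chosen), quartile_difference(composite, is_high))]
    remaining = [name for name in item_weights if name != first_item]
    while remaining:
        def diff_if_added(name):
            trial = [c + w for c, w in zip(composite, item_weights[name])]
            return quartile_difference(trial, is_high)
        best = max(remaining, key=diff_if_added)
        composite = [c + w for c, w in zip(composite, item_weights[best])]
        chosen.append(best)
        remaining.remove(best)
        history.append((tuple(chosen), quartile_difference(composite, is_high)))
    return history


# Weights for Height-weight ratio, using the Table 1 percentages.
print(scoring_weights([14, 47, 39], [8, 65, 27]))

# Hypothetical holdout group of 8 cases (1 = high criterion) and 3 items.
is_high = [1, 1, 0, 1, 1, 0, 0, 0]
item_weights = {
    "x": [3, 1, 3, 2, 2, 2, 3, 1],
    "y": [2, 2, 0, 1, 1, 0, 0, 0],
    "z": [0, 0, 2, 0, 0, 2, 2, 2],
}
history = select_items(item_weights, is_high, first_item="x")
battery = max(history, key=lambda row: row[1])[0]
print(history, battery)
```

Reading the returned history from top to bottom plays the role of the recording sheet: the battery is cut off at the order of analysis after which Q₃ - Q₁ stops rising.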


[Fig. 2. Per cent of high tenure operators for various combinations
of the combined marital status and education categories. Bars, from
top to bottom: single, below H.S. educ.; single, high school educ.;
single, above H.S. educ.; married, below H.S. educ.; married, high
school educ.; married, above H.S. educ. Horizontal axis: % high
tenure operators in holdout group.]


Results


On the basis of the results provided by the
item selection technique, two of the original
items that were considered to be significantly
related to tenure (Education and Marital
Status) were chosen to compose the selection
battery. The per cent of high tenure operators
in the holdout group for each combination
of response categories for the two
items are presented in Figure 2.

It was felt that presenting these results in
terms of combined response categories would
be more meaningful than presenting them in
terms of composite scoring weights. Actually,
the listing of combined category responses
corresponds to scoring weight magnitudes from


high (top) to low (bottom). A rather impressive
and uniform drop in per cent of high
tenure operators occurs from the first cate- 
gory (single, below high school education) to 
the last category (married, above high school 
education). The trend, while in the right di- 
rection, is somewhat stabilized for the two 
middle categories (single, above high school 
education and married, below high school edu- 
cation). The per cent of high criterion opera- 
tors in the former is 53 and in the latter, 50. 
Chi square was computed for the contingency 
table composed of frequencies of high and low 
tenure operators at the six combinations of 
response categories in order to test the hy- 
pothesis that the relationship expressed here 
could be attributed to chance. The null hypothesis
was rejected at better than the 1%
significance level.
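
The statement that the listing of combined categories runs from high to low composite scoring weight can be verified directly from the Table 1 weights for Marital Status and Education. The short sketch below does this; the dictionary labels are illustrative.

```python
# Scoring weights taken from Table 1.
marital = {"single": 36, "married": 8}
education = {
    "below high school": 31,
    "high school graduate": 28,
    "above high school": 7,
}

# Composite weight for every combination of the two items.
composites = {
    (m, e): m_weight + e_weight
    for m, m_weight in marital.items()
    for e, e_weight in education.items()
}

# Rank the six combinations from highest to lowest composite weight.
ranked = sorted(composites, key=composites.get, reverse=True)
for combo in ranked:
    print(combo, composites[combo])
```

The ranking reproduces the top-to-bottom order of the six categories in Figure 2, from single with below high school education (67) down to married with above high school education (15).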


Summary 


An item selection technique for categorical 
data, the quartile difference method, was de- 
veloped to help the researcher select the most 
highly predictive combination of items from 
a pool of possible predictors. The technique 
while not completely precise (quarter splits 
have to be approximated and consequently 
affect the precision of the quartile values for 
the various analyses) does, however, provide 
a systematic procedure for the selection of 


categorical predictors. The mechanics of the 
method were demonstrated with four items 
that were found to be related to the tenure of 
telephone operators on the basis of item 
analyses with a primary group. It was found 
that a combination of two of these items 
(Education and Marital Status) appeared to 
be more highly predictive of the criterion than 
was any other combination of items. In this 
regard, the per cent of high tenure operators 
decreases as marital status changes from single 
to married and as education increases. 


Received December 19, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953


The Construction of a Personality Scale to Predict Scholastic 
Achievement `° 


Harrison G. Gough 


Department of Psychology and Institute of Personality Assessment and Research, 
University of California, Berkeley 


This report describes an attempt to develop
a brief personality scale to predict college
undergraduate course grades, and particularly
undergraduate course grades in psychology.
The study was undertaken with the expectation
that its findings would contribute to a
broader understanding of some of the non-intellective
factors relating to academic achievement,
particularly those factors having to do
with personal values, beliefs, and self-definitions.
The construction of the scale represents
one of a series of studies devoted to the
measurement of positive and favorable aspects
of personality and individual functioning
being carried out by the writer. The present
scale, along with a number of earlier scales
for such factors as social participativeness,
dominance and leadership, social responsibility,
and intellectual efficiency, is included
as a sub-test in the California Psychological
Inventory.²
The first step in constructing the present
scale was to assemble a pool of criterion-specific
personality inventory items. The writing
and selection of beginning items was based
upon three general sources: previous findings,
theories about academic motivation and
achievement, and intuitive hunches about contributory
factors. There is not space in this
summary to do more than refer to the procedures
used in writing and selecting items, but it
should be emphasized that a major factor in
the possible success of any such scale as the
present one is the veridicality and authenticity
of the items themselves. No
amount of analytical precision at some later
time can overcome the limitations of an inept,
superficial, or tangential pool of items. It is
the writer’s belief that many psychological
studies on the prediction of complex criteria
from personality inventory data have floundered
because of failure to observe this simple,
but fundamental, prerequisite.

Four original samples were obtained for
item analysis. These consisted of introductory
psychology classes at the University of
California, the University of Minnesota, and
Vanderbilt University.⁴ Each item was studied
in at least three of the four samples, and all
items revealing discriminatory power in each
instance were retained. Table 1 lists five of
the items and the basic item analysis statistics.⁵


Table 1

Sample Items from the Hr (Honor Point Ratio) Scale Distinguishing between Students with
Higher and Lower Course Grades

                                     Proportion in Each Sample Saying “True”
                                     California       Minnesota        Vanderbilt
                                     Class            Class            Class
                                     Higher  Lower    Higher  Lower    Higher  Lower
Item                                 (N=50)  (N=50)   (N=40)  (N=40)   (N=20)  (N=20)
1. Lawbreakers are almost always      [?]     62       48      58       25      50
   caught and punished.
2. For most questions there is just    22     46       30      38       20      50
   one right answer, once a person
   is able to get all the facts.
3. It is annoying to listen to a       72     92       65      88       65      90
   lecturer who cannot seem to make
   up his mind as to what he really
   believes.
4. The future is too uncertain for      8     22        8      25       35      45
   a person to make serious plans.
5. Teachers often expect too much      32     [?]      [?]     22       52      [?]
   work from the students.


[...] This project was carried out under a research grant from
the National Institute of Mental Health, [...] Health, U. S. [...].

[...] This paper is a revision and extension of a [...] given at the
annual meetings of the American Psychological Association in
Washington, D. C., September [...].

[...] The complete bibliography for the [...] here. For selected
references see [...], (5), and (6).

⁴ These samples were very kindly made available
by Drs. John Gustad, Rheem Jarrett, and Miles A.
Tinker.

⁵ A longer version of Table 1 giving the item percentages
and significance tests for the complete scale
has been deposited with the American Documentation
Institute. Order Document 3947 from the ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington 25, D. C., remitting
$1.25 for microfilm (images 1 inch high on standard
35 mm. motion picture film) or $1.25 for photoprints
readable [...].

⁶ [...] of these 36 items were taken, by permission,
from the Minnesota Multiphasic Personality Inventory
(Hathaway, S. R., and McKinley, J. C. The Minnesota
Multiphasic Personality Inventory. Minneapolis:
University of Minnesota Press, 1943). [...] in the MMPI
and the scored responses [...]: [?]3F, 78T, 122T, 157F,
248F, 250F, 260F, 295T, 313F, 395F, 437F, 443F, 469F,
492F, [?]98F.


The Items


Altogether, 36 items ⁶ from the pool of 150
items were retained for the first version of the
scale, called Hr (for honor point ratio) to distinguish
it from an Ac (high school academic
achievement) scale developed earlier by the
writer (4). The 36 items, and the responses
predictive of higher grades are given below:


1. I have had very peculiar and strange experiences.
(F). 2. I have very few fears compared
to my friends. (F). 3. [...] part in the
entertainment at [...]. [...] 4. It
is always a good thing to be frank. (F). 5. I
don’t blame anyone for trying to grab all he can
get in this world. (F). 6. I was a slow learner
in school. (F).

7. Sometimes without any reason or even when
things are going wrong I feel excitedly happy, “on
top of the world.” (F). 8. Parents are much too
easy on their children nowadays. (F). 9. Teachers
often expect too much work from the students.
(F). 10. I think I would like to fight in
a boxing match sometime. (F). 11. I have often
found people jealous of my good ideas just because
they had not thought of them first. [...]
12. People pretend to care more about one another
than they really do. (F).

13. The future is too uncertain for a person to
make serious plans. (F). 14. The man who provides
temptation by leaving valuable property
unprotected is about as much to blame for its
theft as the one who steals it. (F). 15. I dread
the thought of an earthquake. (F). 16. I am
bothered by people outside, on streetcars, in
stores, etc., watching me. (F). 17. I [...]
I have often been punished without cause. [...]
18. I seem to be about as capable and smart as
most others around me. (T).

19. I like poetry. (T). 20. It is annoying to
listen to a lecturer who cannot seem to make
up his mind as to what he really believes. (F).
21. I like to plan a home study schedule and then
follow it. (F). 22. Our thinking would be a lot
better off if we would just forget about words
like “probably,” “approximately,” and “perhaps.”
[...] 23. For most questions there is just one
right answer, once a person is able to get all the
facts. (F). 24. It is all right to get around the
law if you don’t actually break it. (F).

25. I often lose my temper. (F). 26. I sometimes
feel that I am a burden to others. (F).
27. I [...] to my father as an ideal man. [...]
28. Lawbreakers are almost always caught and
punished. (F). 29. I liked “Alice in Wonderland”
by Lewis Carroll. (T). 30. I have a [...]
give up easily when I meet difficult problems.
[...]

31. The trouble with many people is that they
don’t take things seriously enough. [...] 32.
[...] would try to change our [...]. (F).
33. Even when I do sit down to study
it is hard to keep my mind on [...]. (F).
34. It is often hard for [...]
the questions are driving [...]
test. (F). 35. I have to wait [...]
mood before I can sit down and study. (F).

A Personality Scale to Predict Scholastic Achievement 


Table 2

Summary Statistics for the Original Samples
on the Hr Scale

                                                          r with
                                                          Course
Sample                                N      M      SD    Grades
1. Introductory psychology           180    15.7    2.6    .42
   class at California,
   June, 1950.*
2. Introductory psychology            67    [?]     [?]    .57
   class at Minnesota,
   August, 1949.**
3. Introductory experimental         270    24.9    4.0    .47
   psychology class at
   Minnesota, October, 1950.
4. Introductory psychology            86    21.9    4.4    .47
   class at Vanderbilt,
   October, 1950.

* Took only 24 of the 36 items in the full scale.
** Took only 12 of the 36 items in the full scale.


36. I plan very carefully about which school 
Courses I will take. (T). 


Results 


This Hr scale was correlated with course
grades in the original four samples, totalling
603 cases, with the results indicated in Table 2.
The median r is .47, and the mean r, using
the z-transformation, is .48.


Table 3

Summary Statistics for the Cross-Validating Samples
Given the Full 36-Item Hr Scale

                                                          r with
                                                          Course
Sample                                N      M      SD    Grades
1. Introductory psychology           [?]    [?]     [?]    [?]
   class at California,
   March, 1951.
2. Introductory psychology           117    22.9    4.2    .26
   class at California,
   [...] 1951.
3. [...] psychology                  [?]    [?]     [?]    [?]
   class at California,
   [...]
4. [...]


Table 4

Summary Statistics for the Cross-Validating Samples
Given the 32-Item Version of the Hr Scale
Included in the California Psychological Inventory

                                                          r with
                                                          Course
Sample                                N      M      SD    Grades
1. Introductory psychology           348    21.3    4.0    .31
   class at California,
   October, 1951.
2. Introductory psychology            23    22.5    2.6    .26
   class at Stanford,
   March, 1952.
3. Introductory psychology           211    21.0    4.2    .28
   class at California,
   April, 1952.
4. Introductory psychology            63    21.3    4.1    .60
   class at California,
   July, 1952.
5. Introductory psychology           104    22.2    3.0    .32
   class at California
   (Santa Barbara campus),
   December, 1952.
6. Upper division psychology         139    23.9    3.4    .39
   class at California,
   April, 1952.
7. Upper division psychology          29    23.3    3.6    .45
   class at California,
   July, 1952.


Four cross-validating samples, totalling 336
cases, were given the initial 36-item scale.⁷
Table 3 presents the findings. The median r
here is .33, and the mean r, using the z-transformation,
is .38.

The original 36-item Hr scale contained 
four items pertaining to present attendance in 
school (the last four items in the list above). 
These items were eliminated in the 32-item 
version of the Hr scale included in the Cali- 
fornia Psychological Inventory. This inven- 
tory was given to seven additional college 
samples to obtain cross-validational informa- 
tion on the 32-item scale, when included in a 

large constellation of items.⁸


⁷ These samples were made available through the
courtesy of W. Brown, J. McKee, L. Postman,
and R. Tryon.

⁸ These samples were made available through the
[...] of W. Brown, J. Clark, P. Farnsworth,
[...], D. MacKinnon, and D. Riley.




Table 5 


Correlation of the 32-Item California Psychological
Inventory Version of the Hr Scale with
High School Grade Average

High School                       N      M     SD      r
1. Butler, Pennsylvania         397   15.6    4.2    .38
2. Clarksdale, Mississippi       77   14.3    3.9    .26
3. Franklin, Pennsylvania       108   15.8    4.1    .38
4. Mt. Vernon, Washington       107   16.5    4.1    .35
5. Rock Island, Illinois        224   15.1    4.3    .42
6. St. Cloud, Minnesota         195   14.7    3.9    .39


Table 4 presents these data. The total
number of cases is 917, and the median r =
.32. The mean r, using the z-transformation,
is again .38.

Because the California Psychological Inven- 
tory is designed to be used in high school as 
well as in college settings, the efficacy of the 
Hr scale in predicting high school over-all 
grade averages was determined. Table 5 presents
these data.⁹ The total N is 1,108, the
median r = .38, and the mean r = .36.
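
The averaging by z-transformation mentioned here can be sketched as follows. This is a minimal illustration, not the author's worksheet: it assumes an unweighted average of Fisher z values (the text does not say whether samples were weighted by N), and the correlations are those read from Tables 4 and 5 with decimal points restored. The function name `fisher_mean_r` is hypothetical.

```python
import math
from statistics import median

def fisher_mean_r(rs):
    """Average correlations via Fisher's z: z = atanh(r), mean z, back-transform with tanh."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Correlations as read from the scanned tables (assumed unweighted by sample size).
table4_r = [.31, .26, .28, .60, .32, .39, .45]   # seven college samples
table5_r = [.38, .26, .38, .35, .42, .39]        # six high school samples

for name, rs in [("Table 4", table4_r), ("Table 5", table5_r)]:
    print(name, "median r =", round(median(rs), 2),
          "mean r =", round(fisher_mean_r(rs), 2))
```

Under these assumptions the sketch gives a median of .32 and a mean of .38 for Table 4, and a median of .38 and a mean of .36 for Table 5, which agrees with the figures quoted in the text.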

The Hr scale, along with a wide variety of 
other tests, was also given to a sample of 
40 senior medical students seen at the Uni- 
versity of California Institute of Personality 
Assessment and Research in an intensive assessment
program.¹⁰ Some of the more prominent
findings are presented in Table 6.

Perhaps the most important observation 
here is that the Hr scale correlates with cri- 
terion ratings of achievement in medicine as 
well as it does with undergraduate course 
grades in psychology. Furthermore, its pattern
of correlation with the other variables
listed is uniformly favorable, with the possible
exception of the staff rating on impulsivity.

One of the questions which might now be 
raised is whether the Hr scale is assessing any 
independent achievement variance, or whether 
it is primarily an indirect measure of intellect.
Table 7 affords evidence relevant to this query.


9 These samples were made available th 
kindness of Mr. C. O. Austin, Mrs, M coe 


. S. 
Mr. G. N. Harriger, Mr. H. B, Heidelberg, Messe 
Sorenson, and Mr. R. F. Wilson. ied 
10 The research at the Institute of Personality Assessment
and Research is being conducted under a
grant from the Rockefeller Foundation. See reference
(3) for a discussion of the work of this Institute.




The six correlations with IQ in the high 
school samples are all lower than they are for 
Hr vs. grade averages, and a similar difference 
obtains for the college sample. In the military
sample of 150 cases Hr correlates only
.10 with intellect, but .50 with a measure of
scholastic achievement. The mean r with the
intellectual variables in Table 7 is .26, and
with the indices of achievement is .38.

If these values are taken as reasonable approximations
of the true parameter values, an
estimate of the multiple R between IQ, Hr,
and scholastic achievement can easily be made.
For the typical value of .50 between IQ and
grades, the multiple R would be .57, for a
value of .60 the multiple R would be .64, and
so on. Hr would thus appear to be a partial


Table 6 


Correlation of the Hr Scale with a Variety of Criterion
and Assessment Variables in a Sample of 40
University of California Senior
Medical Students


Variables 


+ Medical faculty criterion ratings. 


1 
a. Potential success si 
b. Originality 
2. Assessment stafi ratings. 56 
a. Personal tempo „51 
b. Breadth of interests 46 
c. Vitality a 
d. Impulsivity Al 
e. Verbal fluency 39 
f. Originality 3 
8g. Positive affect 20 
h. Rigidity 
3. Ratings of performance in improvisations. AA 
a. Dominance 34 
b. Flexibility 35 
c. Ingenuity 
4. Ratings of Performance in charades. 4 
a. Motility 34 
b. Over-all effectiveness -2 
c. Perseveration 3 
g d. Self-consciousness : 
3. Perceptual-cognitive variables. 
a. Size Constancy estimation (near and far 5 
triangles, smallness of error in judging) 
b. uminous tilted square, total error in _ 3! 
adjusting inner line to upright 1 
c. Street Gestalt pictures, accuracy of rec- 3 


gnition 




Table 7
Comparative Correlations between Hr and the Intellectual and Achievement Variables Indicated

                                              Correlation of Hr with
                                            Intellectual   Achievement
Sample                                  N     Variable*     Variable**
I. High Schools
   1. Butler, Pennsylvania            397       .33            .38
   2. Clarksdale, Mississippi          77       .10            .26
   3. Franklin, Pennsylvania          108       .37            .38
   4. Mt. Vernon, Washington          107       .30            .35
   5. Rock Island, Illinois           224       .32            .42
   6. St. Cloud, Minnesota            195       .33            .39
II. College
   1. University of California,       104       .22            .32
      Santa Barbara, psychology class
III. Other
   1. Military Officers               150       .10            .50
* In the high school samples, standard group tests of intelligence were used. In the college sample the criterion
was the Altus Measure of Verbal Aptitude, and in the military sample the Thurstone Primary Mental Abilities Test.
** In the high school samples the criterion was the over-all high school grade average, in the college sample
the grade in psychology, and in the military sample the USAFI Test of General Educational Development:
Reading Comprehension in the Social Sciences.


Table 8


Correlation of the Hr Scale with Other Scales from
the California Psychological Inventory, in a
Nationwide High School Sample*


LR CPI Scale Females Males
2, To (responsibility) re Te
3. p (tolerance) cA i
4g (flexibility) n a
es (status) 53 AS
6. sp (dominance) 31 24
Vp (Social participation) 38 19
8. k emininity) i o1
9, Te (delinquency) 27 -21
10, Ag (intellectual efficiency) ‘68 35
~ (academi 5
h mic achievement, i
u, A School) = 5
K (paychological interests) 45 46
cademic ivati
Bri motivation,
13. 4 eunte school) ü p
it o, peurodermatitis) -.26 -26
` x =e and spontaneity) 33 .27
* Gmpulsivity and self-
S y and self
lg, Ty redness) _.40 i
n, Gi (infrequency) or = 03
8, Sood impression) 3 37
x8 =


dissimulation)


*
he
in 45:423 i

13'stata Males, 2,077 males, from 16 high schools




predictor of academic outcomes in its own
right without drawing to any great extent on 
intellectual factors, and can also add slightly 
to the multiple R prediction of grades from 
measures of intelligence. 
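
The multiple R estimate discussed above can be checked with the standard two-predictor formula, R² = (r₁² + r₂² − 2·r₁·r₂·r₁₂) / (1 − r₁₂²). The sketch below assumes the rounded mean values from Table 7 (.38 for Hr with achievement, .26 for Hr with intellect); the paper does not state which exact inputs were used, and the function name `multiple_r` is hypothetical.

```python
import math

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors, from the three zero-order r's."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)

R_HR_GRADES = 0.38   # mean r of Hr with achievement (Table 7), assumed input
R_HR_IQ = 0.26       # mean r of Hr with intellect (Table 7), assumed input

for r_iq_grades in (0.50, 0.60):
    R = multiple_r(r_iq_grades, R_HR_GRADES, R_HR_IQ)
    print(f"r(IQ, grades) = {r_iq_grades:.2f} -> multiple R = {R:.3f}")
```

With these rounded inputs the formula yields about .56 and .64. The second value matches the text; the first falls slightly below the quoted .57, a gap small enough to come from rounding of the inputs.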
The intercorrelations of Hr with the 18 
other scales on the CPI are presented in Ta- 
ble 8. The highest relationships are with the 
scales for tolerance, flexibility, intellectual effi- 
ciency, and psychological interests. 

The final information presented in this pa- 
per has to do with the social psychological im- 
plications of higher and lower scores on the 
Hr scale. In the research program at the 
Institute of Personality Assessment and Re- 
search previously referred to, each staff mem- 
ber filled in a Gough Adjective Check List (3) 
about each assessee. For some of the analyses 
these observers’ reports were composited into 
a single “general observer’s” report by con- 
sidering each adjective checked by at least 2 
out of 6 senior staff members as being “pres- 
ent,” and as being “absent” if checked by 
only one, or by none. 
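
The compositing rule just described (an adjective counts as "present" when at least 2 of the 6 senior staff members checked it) can be sketched as below. The rater data and the function name `composite_checklist` are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter

def composite_checklist(staff_reports, min_checks=2):
    """Return the adjectives counted as "present": checked by at least `min_checks` raters."""
    counts = Counter(adj for report in staff_reports for adj in set(report))
    return {adj for adj, n in counts.items() if n >= min_checks}

# Hypothetical check lists from six raters for one assessee (illustrative data only).
reports = [
    {"alert", "capable"},
    {"alert", "shy"},
    {"capable", "logical"},
    {"alert"},
    set(),
    {"rigid"},
]
print(sorted(composite_checklist(reports)))
```

In this made-up case only "alert" and "capable" reach the threshold; adjectives checked by a single rater are treated as absent.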

These composited adjective check lists were
used to carry out an analysis of the social
stimulus values of the Hr scale. Two samples
of 30 each were drawn by selecting the
10 highest and 10 lowest subjects on the Hr 
scale from two graduate student samples of 
40 each, and from the sample of 40 medical 
school seniors already mentioned. A study 
was then made of what observers did, in fact, 
say about the 30 highest ranking students, as 
compared with what they did, in fact, say 
about the 30 lowest ranking students. The 
adjectives showing statistically significant dif- 
ferentiations are listed below: 


I. Adjectives checked more frequently about 
higher-scoring subjects on the Hr scale. 


adaptable determined persevering 
alert efficient planful 
ambitious fore-sighted pleasant 
appreciative honest rational 
capable industrious reasonable 
clear-thinking intelligent realistic 
conscientious interests wide reliable 
cooperative logical responsible 
dependable organized resourceful 


II. Adjectives checked more frequently about 
low-scoring subjects on the Hr scale. 


cautious nervous sentimental
dissatisfied preoccupied shy
dull rebellious wary
immature rigid


The patterning of these adjectives is very 
consistent. “Highs” are seen as alert, clear- 
thinking, efficient, intelligent, pleasant, and 
resourceful. “Lows” are seen as dull, imma- 
ture, rebellious, rigid, and wary. The staff 
raters, of course, had no information whatsoever
about the Hr scores of these subjects.


Summary 


A personality scale to predict undergraduate
grades was developed. A mean r with
course grades of .38 in eleven cross-validating
college samples totalling 1,253 cases was
attained. The Hr scale also predicted high
school grades, giving a mean r of .36 in six
high school samples totalling 1,108 cases.

Evidence from eight samples, including 
1,362 cases, was adduced to support the claim 
that the Hr scale is a predictor of academic 
achievement and not simply an indirect and 
inefficient measure of intellect. In these sam- 
ples the mean correlation of Hr with measures 
of intellect was .26, and with indices of academic
achievement was .38.

Additional findings in a sample of 40 senior
medical students revealed a significant correlation
between Hr and ratings of success in
medical training, and between Hr and a number
of assessment variables such as breadth
of interests, originality, flexibility, vitality, effectiveness
in group discussion and in charades,
and adequacy of performance on perceptual-cognitive
tasks involving complex
judgmental decisions.

The final section of the paper listed some 
of the more prominent social-interactional im- 
plications of higher and lower scores on the 
Hr scale. High scorers tend to be seen as capable,
intelligent, and reliable and low scorers
as dissatisfied, dull, rigid, and shy. 


Received November 20, 1952, 


References 


1. Gough, H. G. Some common misconceptions
about neuroticism. Psychol. serv. cent.,
in press.

2. Gough, H. G. Identifying psychological femininity.
Educ. psychol. Measmt., 1952, 12,
427-439.

3. Gough, H. G. Predicting success in graduate
training: A progress report. Berkeley, California:
The University of California Institute
of Personality Assessment and Research, 1950.
P. 1-65 (mimeographed).

4. Gough, H. G. What determines the academic
achievement of high school students? J. educ.
Res., 1953, 45, 321-331.

5. Gough, H. G., McClosky, H., and Meehl, P. E. A
personality scale for social responsibility.
J. abn. soc. Psychol., 1952, 47, 73-80.

6. Gough, H. G., and Peterson, D. R. The identification
and measurement of predispositional
factors in crime and delinquency. J. consult.
Psychol., 1952, 16, 207-219.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953


Kuder Interest Patterns of Medical, Law, and Business School 
Alumni 


Robert H. Shaffer


Office of Dean of Students, Indiana University


and


G. Frederic Kuder


Duke University


This paper reports the comparative scores
made on the Kuder Preference Record by samples
of graduates of the Indiana University
Schools of Medicine, Law and Business who
had graduated in 1941 or previously. In
1950, the Preference Record (Form C) was
sent to 996 graduates of the School of Business,
764 graduates of the School of Law and
992 graduates of the School of Medicine. Returns
were received from 313 for Business,
210 for Law and 242 for Medicine. The
mean ages of the respondents were 37.5 years
for Business, 45.2 for Law and 45.2 for Medicine.
Each return indicated the individual’s
present occupation. Striking and significant
differences have been found among the interest
patterns of the various groups studied.

The first comparison presented is between
the interests of doctors and those of lawyers,
accountants, and other Business School graduates.
Table 1 gives these data. Since accountants
differ so markedly from other business
groups they have been kept separate in
the table. It should be noted, too, that only
practicing lawyers from the law school are reported
in this table. The data from non-lawyers
are presented later.

In this and the following tables the mean
raw scores reported in the first line are taken
as the basis for comparison. The t-test was
used to determine the significance of differences
of means from the base group.

The standard deviations of all groups
studied are reported in Table 3.


Results


Inspection of Table 1 reveals that doctors
had scores significantly different at the 1%
level from lawyers on one of the ten scales,
and from the business groups on seven and
eight of the scales. As a general pattern,
when compared to the other groups, doctors
were higher at the 1% level of significance on
the scientific, social service, artistic and outdoor
scales and lower at the same level of significance
on the computational, persuasive,
and clerical scales.


Table 1 
Mean Interest Scores of Lawyers and Businessmen Compared with Those of Doctors 
Ons / Soc. Cleri- 
door Mech. Comp. Sci. Pers. Art. Lit. Mus. Serv. call 
Med. Sch. Grads 477 409 23.9 49.6 30.7 245 «218 128. 451 37.5 
N = 24 P 
“icticing —_— soap 332t 278 366t 416" 202} 26.5 13.7 30.2" s08 
= 14 y 
ecountants agog 39.8 W1% asset 39.6% 17.5} 23.0 120 32.7} 60.3** 
N= i 
E AE ai 377} 298 340} SLI® 19.6} 218  141* 39.2} 503" 
3 37.2} 
Sets. (N = 269) 
** Signi i level of confidence. 
aS antiy higher st ie leval of confidence. 


significantly higher at the Hier of confidence. 




t Significantly lower at the 1% 




Table 2 


Mean Interest Scores of Accountants and Other Business School Graduates 
Compared with Those of Lawyers 


Out- Soc. Cleri- 
door Mech. Comp. Sci. Pers. Art Lit. Mus. Serv. cal 
Practicing Lawyers 40.1 33.2 27.8 36.6 41.6 20.2 26.5 13.7 39.2 50.8 
(N = 148) : x 
Accountants 38.0 398 42.2" 35.8 39.6 17.5 23.0} 12.0 32.7} 60.3* 
(N = 44) ; 
Bus. Grads excl. 37.2} 37.7** 29.8* 34.0f 51.1** 19.6 21.8} 14.1 39.2 50.. 


Accts. (N = 269) 


** Significantly higher at the 1% level of confidence. 
* Significantly higher at the 5% level of confidence. 
t Significantly lower at the 1% level of confidence. 
f Significantly lower at the 5% level of confidence. 


Table 2 gives the data resulting from a 
comparison of the interest scores of lawyers 
to the two business groups. As a general pat- 
tern the lawyers were significantly higher than 
businessmen other than accountants in the 
literary and scientific areas and lower in the 
persuasive and mechanical areas. The comparison
with accountants differed from this
pattern. The lawyers had lower computa- 
tional, clerical and mechanical scores, and 
higher social service and literary scores than 
accountants. The comparison of lawyers with 
physicians was noted in the discussion of Ta- 
ble 1. 

It is interesting also to note differences be- 
tween subdivisions of the graduates of the 
various schools. Table 3 gives these data. 
The medical school graduates were scattered 
among a number of specialties, but some did 
not report enough detail to allow a more spe- 
cific classification than that of physician. 
However, there were enough who could be 
classified specifically as surgeons and phy- 
sicians-in-general-practice to justify a com- 
parison of the scores from the two groups. 
The most significant difference between these 
groups is on the social service scale, the phy- 
sicians-in-general-practice being significantly 
higher within the 1% level of confidence.
They are also higher on the scientific scale
but only at the 5% level. A trend well with- 
in the 10% level of confidence may also be 
noted for surgeons to be higher on the me- 
chanical scale. 
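
The surgeon versus general-practice comparison on the social service scale can be illustrated from the means, standard deviations, and Ns reported in Table 3. The sketch below assumes an unpooled standard error for the difference of two independent means; the paper says only that "the t-test" was used, so the exact variant is an assumption, and the function name `t_from_summary` is hypothetical.

```python
import math

def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """t statistic for a difference of two independent means (unpooled standard error)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (m1 - m2) / se

# Social service scale, values as read from Table 3.
gp = dict(m=48.3, sd=11.6, n=66)         # physicians-in-general-practice
surgeons = dict(m=41.4, sd=13.4, n=50)   # surgeons

t = t_from_summary(gp["m"], gp["sd"], gp["n"],
                   surgeons["m"], surgeons["sd"], surgeons["n"])
print(f"t = {t:.2f}")
```

Under this assumption t comes out a little above 2.9, which exceeds the two-tailed 1% critical value of roughly 2.6 for samples of this size and so is consistent with the difference being reported as significant at the 1% level.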

Although it appears that the graduates of 
the medical and business schools stay in these 
fields, this generalization does not hold for 


the law graduates. Perhaps the distinction
between law and business is the result of the
terminology used, since “business” covers a
tremendously wide range of activities. A person
could change his occupation greatly and
still be in the field of business. At any rate,
of the law school graduates responding, 29.5%
reported they were in occupations other than
law. Most of these are in related fields often
involving managerial or administration work
in business and industry where they presumably
have occasion to apply their training in
law. Those who are not actually practicing
law are significantly and perhaps surprisingly
higher on the persuasive scale. This is the
only significant difference noted between the
two groups, as shown in Table 3.

The graduates of the business school are in
a wide variety of occupations and except for
the accountants the occupational groups are
too small to justify a breakdown analysis.
Hence the scores of the accountants are compared
with those of the remaining graduates
of the business school. The results are reported
in Table 3. As might be expected,
the accountants are significantly higher in the
computational and clerical areas. A negative
difference of lesser significance may also be
noted on the musical scale. A large proportion
of the non-accountant group is in managerial
or sales occupations. These results
are consistent with those previously reported
for senior men in the I. U. School of Business.*
Senior accounting majors were found
* Shaffer, R. H. Kuder interest patterns of university
business school seniors. J. appl. Psychol.


Table 3
Mean Interest Scores of Sub-Groupings of Doctors, Lawyers and Businessmen

Group                           Outdoor  Mech.    Comp.    Sci.     Pers.    Art.     Lit.     Mus.     Soc.Serv. Clerical
Med. School Graduates       M   47.7     40.9     23.9     49.6     30.7     24.5     21.8     12.8     45.1      37.5
  (N = 242)†                SD  14.3     11.8     8.3      8.4      11.4     8.9               6.5      12.6      10.4
Surgeons                    M   51.4     44.5     22.1     47.9     31.5     24.5     22.7     13.4     41.4      37.2
  (N = 50)                  SD  11.8     12.5     8.8      9.1      10.4     8.6      6.7      6.9      13.4      11.4
Physicians-in-Gen’l-Pract.  M   48.6     40.6     24.7     51.2*    31.1     23.9     20.4     11.0     48.3**    37.9
  (N = 66)                  SD  14.0     10.8     8.2               12.3     8.4      6.8      6.2      11.6      10.5
Law School Grads.
Lawyers                     M   40.1     33.2     27.8     36.6     41.6     20.2     26.5     13.7     39.2      50.8
  (N = 148)                 SD  14.7     13.1     8.1      10.2     13.7     8.6      7.2      6.6      12.0      13.1
Non-Lawyers                 M   42.2     34.4     26.1     34.0     47.2**   20.0     26.2     11.9     38.6      50.6
  (N = 62)                  SD  15.6     13.1     8.8      8.4      16.3     8.4      7.6      6.2      12.8      14.2
Business School
Grads other than Accts.     M   37.2     37.7     29.8     34.0     51.1     19.6     21.8     14.1     39.2      50.3
  (N = 269)                 SD  14.0     12.5     9.9      10.3     15.4     8.5      8.1      6.2      11.9      13.1
Accountants                 M   38.0     39.8     42.1**   35.8     39.6†    17.5     23.0     12.0§    32.7      60.3**
  (N = 44)                  SD  14.4     12.8     8.0      10.1     14.3     9.1      7.3      5.7      12.2      11.6

† As indicated, 242 Medical School graduates returned questionnaires. A large number of respondents


mer vere “ Ai indicating actual type of their practice. To prevent erroneous 
ee ag tliey! wore: a instead of i B eks eral practice” or “surgeon” were used for 
, 


only those who indicated they were in “gen ) y 
Statistical ie fma ol The t-test for Med. School graduates was confined to comparison of surgeons and physi- 
Clans-in-general-practice. 
** Significantly higher at the 1% level of confidence. 
* Significantly higher at the 5% level of confidence. 
† Significantly lower at the 1% level of confidence.
§ Significantly lower at the 5% level of confidence. 


to have significant differences in all nine scales
of the Kuder Form B when compared to all
business seniors.


Summary 


The Kuder Preference Record (Form C)
was given to a sampling of the 1941 or prior
graduates of the Indiana University Schools
of Medicine, Law, and Business. The mean
raw scores of these groups and sub-groups
were compared with the following results:

1. Significantly different interest patterns
were found for doctors, lawyers, and businessmen.

2. In general doctors were higher than the
other groups on the social service, scientific,
artistic and outdoor scales, and lower on the
computational, persuasive, and clerical scales.

3. Lawyers compared to businessmen other
than accountants were higher in the literary
and scientific areas and lower in the persuasive
and mechanical areas. The comparison
with accountants revealed a different general
pattern. The lawyers had lower computational,
clerical, and mechanical scores,
and higher social service and literary interest
than accountants.

4. Physicians-in-general-practice were found
to be higher on the social service and scientific
scales than surgeons. There is also a trend
of less statistical significance for surgeons to
be higher on the mechanical scale.

5. The graduates of the law school who are
not practicing law were found to have a higher
average persuasive score than the practicing
lawyers.

6. Accountants had higher computational
and clerical scores than other business groups,
and lower social service and persuasive scores.


Received November 10, 1952. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953


Kuder Interest Patterns of Student Nurses 


Alma Perry Beaver 


University of California, Santa Barbara College 


A time-honored means of selecting candi- 
dates for a given profession or vocation has 
been the use of the interest or preference in- 
ventory. In the opinion of many investiga- 
tors, interest patterns are more indicative in 
such selection than are the data afforded by 
personality schedules and measures of apti- 
tude now in use. In reporting a comparative 
study of students preparing for five selected 
professions, Blum states (2): “It is significant 
that the greatest differences . . . were in their 
vocational and non-vocational interest tend- 
encies rather than in personality traits. . . .” 
Triggs (3), drawing upon her wide experi- 
ence in counseling individual nurses, makes 
the observation that of those students who 
fail or withdraw from the nursing curriculum, 
the most common finding is deviation in 
scores on the interest inventory; in no other 
respect is she so likely to deviate from the 
usual pattern of scores made by the suc- 
cessful nurse. Using the Kuder Preference 
Record with a group of nurses and a group 
of women-in-general, Triggs found that it did 
an excellent job of differentiation. The writer 
has attempted a similar study with student 
nurses and liberal arts college girls with an 
education major. 


The Present Study 


The experimental group consisted of 80 
students in Knapp College of Nursing in 
Santa Barbara, California, ranging in age 
from 17 to 25 years, all Caucasians with the 
exception of one Japanese girl. Matched in- 
dividually for sex, age, percentile on ACE 
and race, the control group of 50 girls was 
selected from liberal arts college students 
with an education major enrolled in the University
of California, Santa Barbara College.
Table 1 presents these data. The Kuder
Preference Record was administered to the
student nurses as a part of the battery of
qualifying tests given prior to admission to
the Knapp College of Nursing. The education
majors took the inventory on request, as
one of a short battery of tests.

Table 2¹ gives the mean percentile scores
for each of the nine scales of the Kuder
Preference Record for both the experimental
and the control groups. Also given are the
sigmas for each scale, the standard errors
of the means, the sigmas of the difference and
the critical ratios of the difference. Four of
the nine scales yield critical ratios at the .01

Table 1 


Matching Variables, Experimental and Control Groups

                        Experimental   Control
                           Group        Group
Variables                  N = 80       N = 50
Age, Mean                   18.7         18.8
Age, SD                      1.5
ACE, Mean percentile        34.4
ACE, SD                     22.6

level of confidence. Science, with a mean
score of 64.6 for the nurses and a mean score
of 46.4 for the education majors, has a t
value of 8.21. Furthermore, the Persuasive,
Literary, and Social Service scales also yield
highly significant differences between the
groups, the respective t values being 4.97,
4.17 and 4.94. Figure 1 presents graphically
the means of the two groups for all scales of
the Kuder Preference Record. Ranked from
highest to lowest CR value, the four scales
which are least significant in differentiating
the groups are Musical, Artistic, Mechanical
and Clerical.


A less conventional type of analysis of
the Kuder Preference Record was also undertaken.
The “most” and “least” choices for
each item in the clusters of three in all tW

„p the 
` Tables 2 and 3 have been deposited with ocu. 


uxiliary Publicati j Order tio" 
ment No. ions Project. jicatiO’ ¢ 
Project, Chis iom ADI Auxiliary Publics, o 


» Photoduplication Service, Libra) for 
Congress, Washington 25, D. C., remitting $ et 
Photocopies (6 x g inches) or $1.75 for micro” 




[Figure 1: profile chart of mean percentiles on the Kuder scales. Solid line = nurses; dash line = college freshmen.]

Fig. 1. Comparison of mean percentile scores of
80 cadet nurses and 50 college freshmen on 9 categories,
Kuder Preference Test. Significant at the .01
level: science, CR 8.21; persuasion, CR 4.97; literature,
CR 4.17; and social service, CR 4.94.


columns of the answer booklet were computed
for both the experimental and the control
groups. The CR of the percentage of the
difference for the items was then computed.
A total of 76 items were found to be significant
at the .01 level of confidence. This
number included 40 items in which the choice
of the item as “most” served as the basis for
the differentiation of the groups. Another
16 items were found in which the choice
“least” by the groups served as the basis for
identification. In still another 20 items,
either choice, “most” or “least,” yielded t
values at the .01 level of confidence. In sum,
then, there were found 76 items which permit
96 choices which appear to be valid for differentiating
the student nurse from the liberal
arts college education major. The t values
for the 96 choices ranged from 2.62 to 8.22.
The highest value was obtained for the item
“Be a chemist.” It is checked as a “most”
choice by the nurses in the cluster that also
includes “Be a machinist” and “Be an architect.”
Table 4 lists samples of the 76 items.²
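
The item analysis above rests on a critical ratio for the difference between two percentages, followed by a tally of items and choices. The sketch below is illustrative only: the response counts are hypothetical (the item-level counts are in the deposited Table 3, not in the text), the unpooled standard error is an assumption since the paper does not give its formula, and the function name `critical_ratio` is invented for the example. The tally at the end uses the 40, 16, and 20 items reported in the text.

```python
import math

def critical_ratio(x1, n1, x2, n2):
    """CR for a difference between two independent proportions (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical counts: 60 of 80 nurses and 10 of 50 education majors mark an item "most".
cr = critical_ratio(60, 80, 10, 50)
print(f"CR = {cr:.2f}")

# Tally from the text: items differentiating on "most" only, "least" only, or on either choice.
most_only, least_only, both = 40, 16, 20
items = most_only + least_only + both
choices = most_only + least_only + 2 * both
print(items, choices, most_only + both, least_only + both)
```

The tally reproduces the 76 items and 96 choices stated in the text, split into 60 "most" and 36 "least" choices, since each of the 20 two-way items contributes one choice of each kind.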

Further examination of these choices in 
terms of their meaning to the student nurse 
or education major gives evidence of a ra- 
tional basis for most of the items. The stu- 
dent nurse is or is expected to be vitally in- 
terested in science, chemistry, working in a 
laboratory and in its equipment, the discovery 
of cures, and the care of sick people. When 
an item of this type is checked as a “most” 
choice by the student nurse, it usually has a
high t value; that is, it distinguishes the student
nurse from the education major because
members of the latter group seldom check 
such items as a “most” choice. The educa- 


2 Table 3 in its entirety is deposited with ADI.
See footnote 1.


Table 4


Samples of the Significance of the Difference Between the Responses of 80 Student Nurses and 50 College
Women in Education Curricula to “Most” and “Least” Choices on the Kuder Record

Item Triads 


(r) Take a course in sketching. 
(s) Take a course in biology. 
(t) Take a course in metal working. 


(R) Do chemical research, 
(R) Do chemical research. 
Interview applicants for employment. 
(T) Write feature stories for a newspaper. 


(G) Write a political campaign song. 
(A) Write an article on how machine tools are made. 


t Item Marking
Value and Group
of Diff. Involved
4.44 “most,” Educ.
5.19 “most,” Nurses
6.18 “most,” Nurses
5.77 “least,” Educ.
4.40 “most,” Educ. (S)
3.18 “least,” Nurses
3.53 “most,” Nurses


(J) Design a computing machine. 




tion major prefers teaching children, writing 
a best seller, being a journalist, scoring ex- 
aminations as a means of earning pin money 
and interviewing people in a survey of public 
opinion, to name a few of the interests which 
her choices reveal. What does the nurse want 
“least” to do? It will be recalled that the 
nurse scored low in the Persuasive area. She 
has no interest in selling nor is she interested 
in writing a newspaper column. Being a 
journalist or a literary critic or a famous radio 
commentator has no appeal for her. The 
college education major, on the other hand, 
looks with disfavor upon work in chemistry, 
anything to do with a laboratory or research 
equipment, and anything associated with a
hospital.³

A few seemingly odd choices on the part of
both groups require interpretation. A “most”
choice of the education major, the t score
rating of which is second in magnitude to all
the ratings, seems highly peculiar in the light
of what the writer believes to be her interests.
The item is “Be an architect.” This is one
of the cluster mentioned previously in which
the “most” choice of the student nurse is “Be
a chemist.” The education major obviously
does not want to be a chemist; nor is she interested
in being a machinist. She is thus put
in the position of making a forced choice, and
checks the least offensive item, which is for
her “Be an architect.” For this same cluster
the “least” choice of the education major is
“Be a chemist.” The critical ratio of the difference
for this “least” choice is 3.04. Other
possible forced choice answers may be cited.
The item “Sell musical instruments” is checked
as a “most” item by the education majors.
With a t rating of 5.26 the choice is contrasted
with the nurse’s choice of “Help in a sick
room.” Neither wishes to “Repair household
appliances.” Another example of a forced
choice, “Design a computing machine” is
checked as a “most” choice by the student
nurses in preference to “Write a political
campaign song” or “Write an article on how
machine tools are made.”


In a previous study by the writer (1) of 
3 These students are working toward certification 


in the fields of early childhood educati 
d y ati M 
education, and physical education at pote Ens jie 
3 s. 




responses to the MMPI made by student 
nurses and a matching group of college educa- 
tion majors, 66 items were singled out which 
differentiated the groups at the .05 level of 
confidence or better. Of these items, 23 were 
found to be significant at the .01 level. On 
the basis of an analysis of these items, pre- 
sumptive evidence of personality attributes 
possessed by the student nurse was presented. 
A point of interest in the present investigation 
was the possibility of parallel findings or re- 
lated data in the patterns of response of the 
student nurse to two inventories designed to 
explore different facets of personality. Findings
are largely negative. Only one striking
parallel exists. The two items with the highest
t ratings on the previous study were “I
like science” and “I like to read about
science” respectively. The two items with
highest t ratings in the “most” choices of the
student nurse on the Kuder Preference Record
are “Be a chemist” and “Give popular lectures
on chemistry.” As stated previously,
one can readily infer a logical basis for the
nurse’s choice.

Triggs (3) obtained Preference Record
scores on 826 graduate nurses and compared
them with the scores of 1246 women-in-general.
She found that the interests of nurses
differed significantly at the .01 level from
women-in-general on all scales of the Preference
Record except on the Artistic, where the
difference was found to be significant at the
.05 level only, and on the Mechanical, where
no significant difference was found. Listed in
the order of magnitude, the scales showing
positive magnitude were Social Service,
Science, Artistic and Musical. Those aa 
Cle negative differences were Persuasy,, 
of en related to the present stu 4) in 

1S another study by Triggs ( 


Which she used the scores of the Kuder Pre me 


t 
ence iffere” 
: Record to determine whether diffe of 
Interest Pattern, 


i s exist in specialized fe? sor 
2 ne: She reports reliable differences? sig 
tile, the Public Health nurse makes d 
Social, rd higher scores on the Persuasi?” set 

ervice scales, and significantly, á ica! 


Scores on th 
e Com ; d 
scales, ıputational an 




Summary and Conclusions 


1. An investigation into the interest pat- 
terns of student nurses as contrasted with stu- 
dents majoring in education curricula in a 
liberal arts college utilized the responses of 
the respective groups on the Kuder Preference 
Record. A total of 80 Knapp College of 
Nursing students were matched for race, age, 
and percentile on ACE with 50 education 
majors from the University of California, 
Santa Barbara College. 

2. Mean scores of the two groups in four 
of the interest areas yielded critical ratios of 
the difference at the .01 level of confidence. 
The Science scale with a mean score of 64.6 
for the nurses and 46.4 for the education 
majors has the highest t value of 8.21. The
areas Persuasive, Literary and Social Service
yielded respective t values of 4.97, 4.17 and
4.94.

3. Another form of analysis was attempted
in order to identify items in the clusters of
three in which “most” and “least” choices
showed a valid difference for the experimental
and control groups. A total of 96 choices
made in response to 76 items were found to
be significant at the .01 level of confidence.
Of these, 60 were “most” choices and 36
“least” choices. The t values for the in-
dividual items varied from a high of 8.22 to
a low of 2.62.

4. The student nurse by reason of her
“most” choices manifests interest in or prefer-
ence for science, anything pertaining to the
laboratory, the discovery of cures, and the
care of the sick. The college education major,
on the other hand, anticipates a liking for 
teaching children, is interested in various 
forms of writing, and in interviewing people 
for public opinion surveys. The “least” 
choices of the student nurse are heavily 
weighted with pursuits which require per- 
suasion or selling, any form of writing, re- 
porting or literary criticism. The education 
major’s “least” choices indicate antipathy for 
work in chemistry, anything that has to do 
with laboratory or research equipment and 
anything associated with a hospital. 

5. An attempt to find similar personality 
trends in the response patterns of the student 
nurse to the Kuder Preference Record and to 
the MMPI was not successful. A previous 
study (1) furnished data for the compari- 
son. In one respect only were the findings 
comparable. The two items in the MMPI 
receiving the highest t value referred to a
liking for science. Similarly in the Kuder, 
the two “most” choices of the nurses having 
the highest rating referred to interest in being 
a chemist or giving lectures in chemistry. 


Received June 2, 1953. 
Early publication. 


References 


1. Beaver, Alma P. Personality factors in choice of 
nursing. J. appl. Psychol., 1953, 37, 374-379. 

2. Blum, L. P. A comparative study of students 
preparing for five selected professions includ- 
ing teaching. J. exp. Educ., 1947, 16, 31-65. 

3. Triggs, Frances O. The measured interests of 
nurses. J. educ. Res., 1947, 41, 25-37. 

4. Triggs, Frances O. The measured interests of
nurses: a second report. J. educ. Res., 1948,
42, 113-121.


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 5, 1953 


Personality Factors in Choice of Nursing 


Alma Perry Beaver 


University of California, Santa Barbara College 


Screening devices for choosing candidates 
for a given profession or occupation have in 
the main centered in the cognitive rather than
the affective areas. Measurement of aptitudes
has achieved a degree of objectivity and va- 
lidity that is still somewhat rare in the less 
tangible areas of personality and motivation. 
Yet the importance of the latter factors has 
not gone unnoticed. In the field of nursing 
subjective evaluations have pointed to cer- 
tain intrinsic personality qualifications as es- 
sential to success. Disciplined efficiency un- 
der emergency conditions and the ability to 
give comfort and reassurance to the patient 
are among the demands made upon the nurse. 
More specific identification of essential traits 
and a means of measuring them is a goal not 
yet achieved. 

The belief that nurses as a group do repre- 
sent a more stable segment of the population 
than the average has had popular acceptance 
for some time. Studies bulwarking this be-
lief have not been lacking. In 1927 Elwood
(1) studied two groups, one made up of 
nurses, the other college girls. Using Laird’s 
Introvert-Extrovert Scale and Woodworth’s 
Emotional Inventory, he concluded that both 
tests placed the nurse in a more favorable 
light. Lough (3), using the responses on the 
Minnesota Multiphasic Personality Inven- 
tory as a basis for her analysis, compares 
nursing students with women students en-
rolled in liberal arts and education curricula.
She reported the nurses as being more stable
than the other groups and as having more
masculine interests. In a subsequent study
Lough (4) substantiates her findings through 
a statistical validation of the differences found 
between the cadet nurses and the students of 
General Curriculum. Healy and Borg (2), 
using the Guilford-Martin battery of person- 
ality tests measuring thirteen putative fac- 
tors, compared a group of nursing-school 
freshmen from six schools of nursing with 
students at the University of Texas. They 
found no characteristic pattern in the analy- 
sis of scores of the beginning nurses. They 
state that this is to be expected to some ex-
tent since the students are not screened in
most of the schools and the data were col-
lected prior to the withdrawal of students
not fitted to the program.


The Present Study 


The writer’s study investigating the per-
sonality attributes of student nurses is also a
comparison between student nurses and a
group of college women students majoring in
education curricula. A total of 86 women
students enrolled in the Knapp College of
Nursing at Santa Barbara, California, made
up the experimental group. These were
matched for race, sex, age and percentile on
the ACE with an equal number of education
majors at the University of California, Santa
Barbara College. There was one Japanese in
each group, the remainder being Caucasian.
The age range for both groups was 17 to
[illegible] years. Means for the groups for age and
ACE percentiles are presented in Table 1.
Students were individually matched for [illegible]
variables. In the majority of cases age was
also held constant or varied by not more than
two years. Variation in ACE percentile points
was not greater than three points except for
a very small number of cases.

Table 1

Matching Variables, Experimental and Control Groups

                                 Age            ACE Percentile
                       N      Mean    SD        Mean     SD
Experimental Group     86     [?]     1.5       34.4     22.6
Control Group          86     [?]     1.7       35.1     [?]

The group MMPI was administered to
all students prior to entering college. The
score sheets were then analyzed to determine
whether a group of questions could be identi-
fied which would differentiate one group of
students from the other. A total of 66 items
were singled out, the critical ratio of the per-
centage difference being 2.00 or greater for
each of the 66 items (23 of these items were
significant at the .01 level). These items
were broken down into four named categories
and one miscellaneous group by the investi-
gator. These categories, which presumably
identify personality characteristics of the stu-
dent nurse as contrasted with those of educa-
tion majors, are presented in Table 2. Also
given are the item number of the Group
MMPI, the t value of the difference, and the
answer characteristic of the nursing student.
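
The selection rule described above can be illustrated with a short sketch. It assumes the conventional critical ratio for the difference between two independent percentages with an unpooled standard error; the paper does not state which formula was applied, and the groups here were matched, so this is only an approximation of the procedure. The function names and the response counts are hypothetical.

```python
from math import sqrt


def critical_ratio(yes_a, n_a, yes_b, n_b):
    """Critical ratio of the difference between two percentages.

    Assumption: independent groups and an unpooled standard error.
    The paper does not say which formula was actually used.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    se_diff = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if se_diff == 0:
        raise ValueError("no variation in either group; ratio undefined")
    return (p_a - p_b) / se_diff


def select_items(true_counts, n_a, n_b, cutoff=2.00):
    """Keep the items whose absolute critical ratio reaches the cutoff."""
    return [
        item
        for item, (yes_a, yes_b) in true_counts.items()
        if abs(critical_ratio(yes_a, n_a, yes_b, n_b)) >= cutoff
    ]


# Hypothetical counts of "True" answers: (nurses, education majors),
# with 86 students in each group as in the study.
example_counts = {"item_x": (60, 40), "item_y": (45, 43)}
selected = select_items(example_counts, 86, 86)
```

With these made-up counts only `item_x` clears the 2.00 criterion; `item_y`, where the two groups answer almost alike, is dropped.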
The first category of ten items, labeled “A
Social-Sexual Factor,” is characterized by a
preference for the mannish and for masculine
activities. The student nurse admits a pref-
erence for association with her own sex; she
likes the tall mannish woman. There is noth-
ing of the feminine coquette in her make-up
nor does she find pleasure in social dancing.
Soldiering, reporting sports news, forestry
work, all have their appeal for her.

The second group includes twelve items
and seems to delineate a conventional adher-
ence to custom and a prudish, decorous atti-
tude usually absent in today’s coed. She is
embarrassed by dirty stories. She avoids
sexy shows; she is not in on the gossip of the
group. She disapproves of women smoking
and imbibing alcoholic beverages. She does
not believe that men are absorbed with the
subject of sex. A person should be punished
for breaking the law, even if it is unreason-
able. Duty to a life goal inspires and moti-
vates her.

Minimal psychosomatic concern is the label
given to the third group consisting of eleven
items. Two of these items have high t values.
The first one, “The sight of blood neither
frightens me nor makes me sick,” with a t
value of 4.33, ranks highest in the total list 
with the exception of two items in the miscel- 
laneous group which refer to a liking for sci-
ence. A t value of 3.83 is obtained for the
other item, “I sweat very easily even on cool 
days.” The student nurse does not worry 
about her health, does not dread seeing a doc- 
tor, can’t remember “playing sick,” does not 
feel tired. Symptoms of hypochondria are 
lacking. 

The responses by the student nurse to the 
items making up the fourth category, labeled 
“Freedom from Neuroticism,” are indicative 
of emotional stability. This category con- 
tains the largest number of items, twenty in 
all. The difference is significant at the .01 
level of confidence for seven items. The stu- 
dent nurse denies fear of people, the dark, 
and high places. Anxiety and tension are not 
part of her everyday experience. Home life 
is pleasant and few quarrels with members of 
her family are admitted. She makes no claim 
to personal importance, but states that she 
expects to succeed in the activities which she 
attempts. 

The miscellaneous category, including twelve 
items, presents some difficulties of interpreta- 
tion. Responses to some items, as the first 
two, referring to a liking for science and sci- 
ence reading, are easily understood on a com- 
mon sense basis. As previously stated these 
two items have the highest t value of the en-
tire list, being respectively 8.17 and 7.66. It
seems fairly obvious that candidates for nurs- 
ing training would check such items as true. 
Another group of responses seems almost para- 
doxical in the light of previous interpreta- 
tions. Items 102, 461, and 417, all answered 
as true by the nurse, might be interpreted as 
contradictory of claims made for freedom 
from neuroticism. The writer has no ration- 
alization to offer. These items refer respec- 
tively to the hardest battles being with one- 
self, difficulty in setting aside a task once 
begun, and annoyance with anyone who tries 
to get ahead in line. Three items, 89, 199 
and 261, of rather low discrimination, seem 
to have little significance for this study as far 
as interpretation is concerned. Three other 
items, all answered false, have reference to 
occupational preference and membership in 




Table 2 


The Significance of the Difference Between the Responses of 86 Student Nurses and 86 College Women 
in Education Curricula to 66 Categorized Items on the Group Form of the 
Minnesota Multiphasic Personality Inventory 


MMPI Item   t Value    Marking Characteristic
Number      of Diff.   of Student Nurses        Categories and MMPI Questions

A Social-Sexual Factor Characterized by a Preference for the Mannish and Masculine Activities
514         3.00       True     I like mannish women.
69          2.60       True     I am very strongly attracted by members of my own sex.
435         2.00       True     Usually I would prefer to work with women.
391         2.00       False    I love to go to dances.
441         2.14       True     I like tall women.
144         2.00       True     I would like to be a soldier.
283         2.00       True     If I were a reporter I would very much like to report sporting news.
561         2.66       True     I very much like horseback riding.
81          2.14       True     I think I would like the kind of work that a forest ranger does.
208         2.00       False    I like to flirt.

A Conventional Attitude
427         [?]        True     I am embarrassed by dirty stories.
455         2.66       True     I am quite often not in on the talk and gossip of the group I belong to.
378         2.57       True     I do not like to see women smoke.
457         2.40       True     I believe that a person should never taste an alcoholic drink.
232         2.28       True     I have been inspired to a program of life based on duty which I
                                have since carefully followed.
548         2.14       True     I never attend a sexy show if I can avoid it.
183         2.00       True     I am against giving money to beggars.
485         2.00       False    When a man is with a woman he is usually thinking about things
                                related to her sex.
135         2.66       False    If I could get into a movie without paying and be sure I was not
                                seen I would probably do it.
456         3.00       False    A person shouldn’t be punished for breaking a law that he thinks is
                                unreasonable.
[?]         [?]        [?]      I am easily downed in an argument.
[?]         [?]        [?]      I like to go to parties and other affairs where there is lots of loud fun.

Minimal Psychosomatic Concern
[?]         4.33       [?]      The sight of blood neither frightens me nor makes me sick.
[?]         [?]        [?]      I have never had a fainting spell.
[?]         [?]        [?]      [illegible] hay fever or asthma.
55          2.47       True     I am almost never bothered by pains over the heart or in my chest.
36          2.14       True     I seldom worry about my health.
412         2.00       True     I do not dread seeing a doctor about a sickness or injury.
363         3.83       False    I sweat very easily even on cool days.
481         2.00       False    I can remember “playing sick” to get out of something.
[?]         [?]        False    I am a high-strung person.
[?]         2.60       False    I feel tired a good deal of the time.
[?]         [?]        [?]      I am neither gaining nor losing weight.

Freedom from Neuroticism
137         3.20       True     I believe that my home life is as pleasant as that of most people I know.
96          [?]        True     I have very few quarrels with members of my family.
32          4.00       False    I find it hard to keep my mind on a task or job.
73          3.50       False    I am an important person.
165         3.14       False    I have several times had a change of heart about my life work.
163         4.00       True     I do not tire quickly.
370         3.00       False    I hate to have to rush when working.
352         2.20       False    I have been afraid of people or things that I knew could not hurt me.
388         2.20       False    I am afraid to be alone in the dark.
166         2.00       False    I am afraid when I look down from a high place.
356         2.00       False    I have more trouble concentrating than other people seem to have.
480         2.00       False    I am often afraid of the dark.
351         2.66       False    I am anxious and upset when I have to make a short trip away
                                from home.
534         2.00       True     Several times I have been the last to give up trying to do a thing.
13          2.20       False    I work under a great deal of tension.
345         2.20       False    I often feel as if things were not real.
271         2.00       True     I do not blame a person for taking advantage of someone who lays
                                himself open to it.
257         2.40       True     I usually expect to succeed in things I do.
426         2.20       True     I have at times had to be rough with people who were rude or
                                annoying.
493         2.43       True     I prefer work which requires close attention to work which allows
                                me to be careless.

Miscellaneous
221         8.17       True     I like science.
552         7.66       True     I like to read about science.
102         2.29       True     My hardest battles are with myself.
461         3.29       True     I find it hard to set aside a task that I have undertaken, even for a
                                short time.
417         3.20       True     I am often so annoyed when someone tries to get ahead of me in a
                                line of people that I speak to him about it.
89          2.00       True     It takes a lot of argument to convince most people of the truth.
199         2.00       True     Children should be taught all the main facts of sex.
261         2.00       True     If I were an artist I would like to draw flowers.
204         3.83       False    I would like to be a journalist.
428         [?]        False    I like to read newspaper editorials.
387         2.29       False    The only miracles I know of are simply tricks that people play on
                                one another.
340         2.43       False    I should like to belong to several clubs or lodges.
clubs. Similar [illegible] to the Social-Sexual category. Is it
because [illegible] definitely masculine [illegible] for the
student nurse, or is some other factor operative? [Illegible]
refers to miracles as simply tricks played by people on one
another, was answered [illegible]. The student nurse may have
some sort of youthful idealism which she [illegible] regarding
possible miracles performed by the members of the medical
profession, [illegible] perhaps being assumed.

[Illegible] value of 3.29 signifies high validity for this [illegible].

As a means of validation, the 66 items were [illegible] on a
single sheet and the students were asked to recheck the list.
This presents a slightly different situation from that obtaining
when the 66 items were scattered through the total aggregate of
MMPI items. Furthermore, the cadets were now established in
their training program, whereas previously they were not certain
of admission to the nursing school.


Mean scores were computed for both the 
nursing and the college groups and compared 
with the original averages for the 66 items. 
In each instance, the average number of plus 
answers was lower than in the original tests, 
but the difference between the averages of the 
groups remained approximately the same. A 
further check was attempted when two groups 
of nursing students from the Bishop Johnson 
College of Nursing and the Hollywood Pres- 
byterian Hospital School of Nursing, both in 
Los Angeles, California, were asked to check 
the 66 items. The mean score of the latter 
group, from Hollywood Presbyterian School, 
was very close to that of the original mean 
score and higher than the mean retest score 
of the Knapp group. In the Bishop Johnson 
College group, the mean score was very close 
to that of the Knapp retest score. When the 
significance of the difference between the 
mean scores of the nurses and the college 
women was tested, CR’s were found to vary 
from 5.15 to 11.30. The lowest CR was ob- 
tained from a comparison of the mean scores 
of Santa Barbara students and Bishop John- 
son School nurses; the largest CR, 11.30, was 
obtained from a comparison of the mean 
scores of Santa Barbara students and the 
Knapp College students on the original test. 
Additional comparisons yielded scores inter- 
mediate between these extremes. These data 
are given in Table 3. The Pearson product 
reliability of the 66 item test as measured by 
the split-half, odd-even technique was .64. 
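
The two statistics named in this paragraph can be sketched as follows. This is an illustration under stated assumptions rather than the writer's own computation: the critical ratio is taken to be the difference between two means divided by the standard error of that difference for independent groups, and the odd-even reliability is taken to be a plain Pearson correlation between half-test totals with no Spearman-Brown correction, since none is mentioned. The means and standard errors are the figures that appear to belong to the Knapp and Santa Barbara original-test rows of Table 3; the item-score matrix is hypothetical.

```python
from math import sqrt


def critical_ratio_of_means(mean_a, se_a, mean_b, se_b):
    """Difference between two means over the standard error of the difference."""
    return (mean_a - mean_b) / sqrt(se_a ** 2 + se_b ** 2)


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the sequences has no variance")
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(score_rows):
    """Correlate each person's total on odd-numbered items with the even total."""
    odd_totals = [sum(row[0::2]) for row in score_rows]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in score_rows]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Figures as read from Table 3 (assumed to be Knapp vs. Santa Barbara, original test).
cr_original = critical_ratio_of_means(41.6, 0.48, 32.2, 0.67)

# Hypothetical 0/1 item scores for four students on a four-item scale.
toy_rows = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
toy_r = odd_even_reliability(toy_rows)
```

With the rounded table figures `cr_original` comes to about 11.4, close to the 11.30 reported in the text; the small gap is what one would expect from rounding in the published means and standard errors.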




Summary 


1. An investigation into the personality at- 
tributes of student nurses as compared with 
education majors utilized the responses of the 
group MMPI as the basis for study. A total 
of 86 women students enrolled in the Knapp 
College of Nursing at Santa Barbara were 
matched for race, age and ACE aptitude per-
centile with an equal number of education
majors at the University of California, Santa
Barbara College.

2. From the total number of MMPI re-
sponses 66 items were singled out to make up
a scale which differentiated one group from
the other. The criterion of selection was a t
of 2.00 or greater. Of the 66 items, 23 were
found to be significant at the .01 level of con-
fidence. The odd-even, split-half technique
yielded an r of .64 as the reliability of the
66 point test.

3. The 66 point test gave mean scores which
differentiated the two groups to approxi-
mately the same degree as did the original
tests, although the absolute scores were lower
in each case. The CR of the mean difference
between the two groups in the original test
was 11.30. When the students were retested
with the 66 item test, the CR was 9.98. The
difference in mean scores between the two
groups in the original test was 9.38 and in the
retest 9.68. Two other groups of student
nurses in Los Angeles schools who were tested
on the 66 item test yielded average scores
similar to the experimental groups. The CR’s
of the mean difference for these respective
groups when compared with the Santa Bar-
bara College group were 5.15 and 7.95.

Table 3

Mean Scores for Sixty-six Items, Original and Retest Data

Student Groups                                  N      Mean    Sigma   SEm     CR

Original Test
1. Knapp College of Nursing*                    86     41.6    4.5     .48     1 vs. 2   11.3[?]
2. Santa Barbara College*                       86     32.2    6.2     .67     2 vs. 3    5.15
3. Bishop Johnson School of Nursing             50     [?]     [?]     [?]     2 vs. 4    7.95
4. Hollywood Presbyterian School of Nursing     [?]    [?]     5.5     [?]

Retest
5. Knapp College of Nursing                     [?]    38.8    4.6     .62     5 vs. 6    9.98
6. Santa Barbara College                        [?]    29.1    4.4     [?]

* Sixty-six items embodied in total matrix of MMPI

4. The 66 items were broken down into 
four categories presumably identifying per- 
sonality attributes of the student nurse. 
These were labeled a Social-Sexual Factor, a 
Conventional Attitude, Minimal Psychoso- 
matic Concern, and Freedom from Neuroti- 
cism. An additional group of 12 items made 
up a miscellaneous category. 

5. The study offers evidence that the stu-
dent nurse presents a significantly different
pattern of response for a small number of
selected items on the group MMPI when
compared with a group of college education
majors. Presumptive evidence is furnished
that the student nurse is a more stable in-
dividual who exhibits a preference for her
own sex and likes mannish qualities in her
associates. She is fastidious and conven-
tional in her attitude and is duty inspired.
Symptoms of hypochondria are lacking as is
evidence of neuroticism. These findings, in 
general, corroborate the findings of Elwood 
and Lough. Though both Lough and the 
writer used the group MMPI as the basis for 
analysis, the approach is somewhat different. 
Lough utilized the mean scores from the 
MMPI profile while the writer used the in- 
dividual item responses. 


Received January 9, 1953. 


References 


1. Elwood, R. H. The role of personality traits in
selecting a career; the nurse and the college
girl. J. appl. Psychol., 1927, 11, 199-201.

2. Healy, I. and Borg, W. R. Personality character-
istics of nursing school students and graduate
nurses. J. appl. Psychol., 1951, 35, 265-280.

3. Lough, Orpha M. Women students in liberal arts,
nursing and teacher training curricula and the
MMPI. J. appl. Psychol., 1947, 31, 437-445.

4. Lough, Orpha M. Correction for “Women stu-
dents in liberal arts, nursing, and teacher
training curricula and the MMPI.” J. appl.
Psychol., 1951, 35, 125-126.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953 


Item Validity of the Lee-Thorpe Occupational Interest Inventory 1


Leopold Bridge and Meyer Morson 


Baltimore Regional Office, Veterans Administration


During the use of the Occupational Interest 
Inventory, Advanced, for the identification of
specific interest areas it was noticed that some 
of the test items did not appear to relate 
closely to the specific areas for which they 
were scored. This apparent discrepancy led 
the authors of the present article to explore 
the general concepts of validity underlying the 
construction of the Lee-Thorpe Inventory to 
determine whether their observations had any 
bearing on the usefulness of the test. 

A search of the available published ma- 
terial showed little relating to the validity of 
the Occupational Interest Inventory. Super 
wrote in 1949 (6) that he had located no 
studies of its validity; and no validity studies 
are reported in the Third Mental Measure- 
ments Yearbook (3). A study by McPhail 
(5) described the establishment of inter- 
est profiles for various occupational groups 
through the use of the Inventory but did not 
examine the validity of the test as a measur- 
ing instrument. 

No validity data are presented in the 
Manual of Directions (4) for the Inventory 
but the authors state that the observation of 
the following criteria has contributed to the 
validity of the tests: (1) the selection of the 
items; (2) the design or description of the 
items; (3) the balance of the items consti- 
tuting the Inventory; and (4) the presenta- 
tion of the items. 

The importance of these criteria is apparent 
from an inspection of the test. This study 
concerns only Part I, which consists of 120
pairs of items, each member of a pair being
identified by the authors with one of six ma-
jor occupational fields designated by a letter
and a descriptive phrase: (A) Personal-Social
(P.-S.); (B) Natural (Nat.); (C) Mechani-
cal (Mech.); (D) Business (Bus.); (E) The
Arts (Ar.); and (F) The Sciences (Sci.).

1 This work was not performed in connection with
Veterans Administration activities and does not in-
volve VA records. The opinions expressed are those
of the authors and not necessarily those of the Vet-
erans Administration. The authors wish to express
their sincere appreciation for the cooperation of Miss
Banos of the Maryland Employment Service, [illegible]
Sprol of the Veterans Administration, and Dr.
[illegible] Terwilliger of the Maryland State Vocational
Rehabilitation Division, and their respective staffs.

Each major field contains 40 items each 
presumably descriptive of the respective field. 
It is, obviously, essential that extreme care 
be exercised in the selection of each test item 
when a total of 120 responses will result in de-
termining the order of preference among six
areas of interest. 

Unless an activity is properly designated, a
preference for it will distort two scores: the
field for which it is scored and the field to
which it really belongs. In the event that the
items are not properly representative of the
occupational fields for which they are scored,
it would be expected that the testees would
express disagreement with their inventoried
interest patterns or with the relative rankings
of their inventoried interests.

In two studies Brown (1, 2) compared the
expressed and inventoried interests of veter-
ans as measured by the Lee-Thorpe Occupa-
tional Interest Inventory and found that there
were significant differences. Although Brown
did not discuss the question it is possible that
some of the discrepancy was due to the fact
that items in the test were improperly desig-
nated as to the occupational field.


Method and Procedure 


In order to evaluate the hypothesis that
some of the Lee-Thorpe items were not prop-
erly designated it was necessary to analyze each
item to see if it was assigned to the proper
occupational field. It was felt that the per-
sons best qualified to make such a determina-
tion would be those who had considerable experi-
ence in occupations either from the point
of view of counseling or occupational analysis.
The items were therefore reviewed by a group
of 38 raters, including 18 Counselors and Oc-
cupational Analysts from the Maryland State
Employment Service with an average experi-
ence of 12 years, 15 Vocational Counselors
from the Maryland State Rehabilitation Di-
vision whose experience averaged 4¼ years,
and 5 Vocational Advisers from the Veterans
Administration with an average counseling ex-
perience of 6 years.

Table 1

Analysis of Dissent Scores for Major Occupational Fields

Field of      No. of    Items         Items Not     No. of      Mean Dissents
Interest      Items     Questioned    Questioned    Dissents    Per Item
P.-S.           40         14            26           165             4
Nat.            40         26            14           341             9
Mech.           40         30            10           596            15
Bus.            40         12            28            76             2
Ar.             40         23            17           305             8
Sci.            40         28            12           346             9
All Fields     240        133           107          1829             8

Each person was asked to assign each of the
240 test items to the Field of Interest in
which he felt that it belonged. The possi-
bility of being influenced by the prior desig-
nations of the test constructors was eliminated
by securely masking the letter designations of
the items printed in Part I of the Inventory.

The responses of the 38 qualified raters
were scored in terms of dissents, i.e., assign-
ment of an item to a Field other than that
designated by the authors. The few instances
where raters felt that an item did not fit into
any of the six fields of interest were also
scored as dissents. On this basis the dissent
score for each of the 240 items could have any
value from 0 to 38. A score of 0 indicates
that all the raters assigned the item to the
same field as the authors of the test while a
score of 38 indicates that none of the raters
felt that the item belonged in the field to
which it had originally been assigned.
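
The scoring rule just described is simple enough to state as a short sketch. The function name and the use of `None` for a "fits none of the six fields" judgment are illustrative choices, not part of the study; the field abbreviations are those used in the Inventory.

```python
def dissent_score(author_field, rater_fields):
    """Count the raters who did not assign the item to the authors' field.

    A rating of None stands for "fits none of the six fields", which the
    study also scored as a dissent.
    """
    return sum(1 for field in rater_fields if field != author_field)


# An item the authors scored as Personal-Social, which 23 of the 38 raters
# also placed in Personal-Social and 15 placed in Business.
ratings = ["P.-S."] * 23 + ["Bus."] * 15
score = dissent_score("P.-S.", ratings)
```

With 38 raters the score necessarily falls between 0 and 38, as stated in the text; the split shown here yields a dissent score of 15.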


Results 


After tabulating the dissent scores by item
and occupational field a wide scattering
of scores was apparent. Opinion ranged from
total agreement to total disagreement with a
general tendency to cluster at the extremes.
Some degree of dissent was found for 133 of
the 240 items. The Fields which had the
fewest questioned items were the Business
and Personal-Social while the greatest number
of challenged items were in the Mechanical
and Scientific Fields.

In view of the fact that dissent scores
ranged from 0 to 38, it was felt that the
intensity of the dissent per item should also
be considered in evaluating the soundness of
the various fields of interest.

Table 1 indicates that the Business Field 
was considered as the soundest area in the 
test. In addition to having the fewest items 
questioned, it also has the lowest total num- 
ber of dissents. On this same basis the six 
fields of the Occupational Interest Inventory 
may be ranked from most valid to least valid 
as follows: 1. Business; 2. Personal-Social; 
3. Arts; 4. Natural; 5. Scientific; and 6. 
Mechanical. This order expresses the rela- 
tive degree to which the items in the test 
were considered to actually correspond to the 
field for which they are scored and which they 
purport to measure. 
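
The ranking can be checked directly from the dissent totals in Table 1. The sketch below uses only the number of items and the number of dissents per field as printed there; the variable names are illustrative.

```python
# field: (number of items, number of dissents), as given in Table 1
TABLE_1 = {
    "P.-S.": (40, 165),
    "Nat.": (40, 341),
    "Mech.": (40, 596),
    "Bus.": (40, 76),
    "Ar.": (40, 305),
    "Sci.": (40, 346),
}

# Mean dissents per item, rounded to whole numbers as in the table.
mean_dissents = {
    field: round(dissents / items) for field, (items, dissents) in TABLE_1.items()
}

# Fields ordered from fewest to most dissents, i.e. most valid to least valid.
ranking = sorted(TABLE_1, key=lambda field: TABLE_1[field][1])

total_items = sum(items for items, _ in TABLE_1.values())
total_dissents = sum(dissents for _, dissents in TABLE_1.values())
overall_mean = round(total_dissents / total_items)
```

Sorting on total dissents reproduces the order given in the text (Business, Personal-Social, Arts, Natural, Scientific, Mechanical), and the totals of 240 items and 1829 dissents agree with the "All Fields" row.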

Further analysis of the dissent scores can
be used for an evaluation of the individual
item validities. On the basis of these scores
the test items can be divided into three major
categories:

1. No Dissents ............ Raters place item in the same occupational
field as the authors.

2. Low Dissent Score ...... Some disagreement as to the proper placement
of the item but too slight to justify positive assertion that item is
improperly placed.

3. Significant Dissent .... Raters believe that item does not belong in
field assigned by authors.

A. High Agreement among raters .... When there is a high degree of
agreement among the dissenting raters the item may be considered as
sound but belonging in a field other than that assigned by the authors.

B. Disagreement among raters ...... Item is not sound because there is
no general agreement as to the field to which it should be assigned.


As has been previously stated there are 107 
items (out of 240) concerning which the raters 
are in complete accord with the authors of the 
test. Further, if we assume that a dissent score 
of 12 or less (at least 66% agreement with the 
authors) indicates a satisfactorily classified 
item, we find that 76 additional items or a 
total of 183 should be considered as assigned 
to the proper occupational field. In other 
words, the raters agree that 76% of the ques- 
tions in Part I of the Occupational Inventory 
meet the necessary criteria of validity for
inclusion in the test.

There are 24 items with dissent scores of 
26 or more which indicates substantial dis- 
agreement as to the authors’ classification of 
the items but where, in addition, there are 
significant agreements among the raters as to 
the proper placements of the items. Of these 
items 11 are now classified in the Mechanical 
Field, 4 each in Arts and Sciences, 3 in Natu- 
ral and 2 in Personal-Social. As a result of 
the raters’ revisions, 5 should be in the Per- 
sonal-Social Field, 3 in the Natural, 3 in the 
Mechanical, 3 in the Business, 4 in the Arts, 
and 6 in the Sciences. The following exam- 
ples are characteristic of the items that the 
raters scored as being in improper categories: 


 Occupational Field
    Assigned by
Authors    Raters    Item as Shown in Inventory

Nat.       Sci.      “Plan experiments to control worms,
                     insects and other pests.”

Art.       P.-S.     “Teach people how to improve their
                     manners and poise.”

Sci.       Mech.     “Clean and recharge storage batteries.”

Mech.      Art.      “Paint signs on windows or do lettering
                     on posters with brush or pen.”

Mech.      Sci.      “Experiment with the making of synthetic
                     products such as artificial teeth,
                     nylon or cellophane.”


Up to this point the discussion has accounted
for 207 items. The balance consists of the 33
controversial items where no substantial number
of the raters could agree as to the proper
occupational designation. The following items
are characteristic of this group (the numbers in
parentheses indicate the number of raters
preferring the occupational field designated):


 Occupational Field
    Assigned by
Authors    Raters         Item as Shown in Inventory

P.-S.      P.-S. (23)     “Take care of the correspondence and
           Bus.  (15)     private affairs of another person.”

Nat.       Nat.  ( 9)     “Direct the quick-freezing or
           Sci.  ( 9)     dehydration of farm products.”
           Bus.  (19)
           Mech. ( 1)

Mech.      Mech. ( 5)     “Label bottles, sort and wrap fruit,
           Bus.  (22)     or pack eggs.”
           P.-S. ( 5)
           [...] ( 1)
           Nat.  ( 5)

Art.       Art.  ( 3)     “Mow lawns; [...] hedges and bushes;
           P.-S. (12)     trim trees.”
           Nat.  (23)

Sci.       Sci.  (14)     “Keep a doctor’s tools and equipment
           P.-S. (20)     in order.”
           Mech. ( 4)


A careful inspection of these items will
indicate the basis on which the raters [...]
their opinion. [...] ways by persons taking the
test and [...] as set forth in the Manual.

A study of the results of this survey [...]
involved a comparison of expressed and
inventoried interests as shown by the Lee-Thorpe
Occupational Interest Inventory (2). [...]



Item Validity of Lee-Thorpe Occupational Interest Inventory 


of the veterans felt that their expressed interests
corresponded with the relative ranking of their
interests as shown by the test. He found that the
greatest dissent occurred in connection with
scores in the Mechanical Field, with a bias toward
the belief that the scores were too low. A review
of the test items by a group of raters experienced
in the fields of vocational counseling, placement
and job analysis reveals that 11 of the 40 items
now classified as “Mechanical” actually belong in
other fields but that only three items otherwise
classified should be considered as “Mechanical.”
This leaves the test with only 32 items in this
field, so that in many instances mechanical
interests may be inadequately measured; this may
account for the expressed dissatisfaction found
by Brown.

The results of the present study indicate that
room exists for further detailed work on the
selection of items for inclusion in the Inventory.
The problem of selecting items that can be
considered as belonging exclusively to one field
is difficult but not impossible, as shown by the
number of items on which the raters were in
complete accord with the authors. It is essential,
however, to see that the items do not contain
activities or elements that belong to more than
one field. It would be even more important to
demonstrate that the items also do, in fact,
discriminate between occupational groups a la
Strong.


Summary and Conclusions 


1. The 240 items of Part I of the Lee-Thorpe
Occupational Interest Inventory, Advanced were
analyzed in terms of agreement of raters with the
occupational fields for which they are scored.



2. The analysis was performed by 38 raters 
with extensive backgrounds in vocational 
counseling and occupational analysis. 

3. Scoring was done in terms of dissents, 
i.e., disagreement with the authors’ classifica- 
tion of the items. 

4. Raters were in complete accord with the 
authors of the Inventory on 107 items and in 
substantial agreement on 76 more items. 

5. Raters felt that 24 items were in occu- 
pational fields other than those in which they 
are now assigned, the Mechanical Field being 
least reliable with 11 out of 40 items con- 
sidered to be improperly classified. 

6. No substantial agreement was reached as 
to the proper occupational classification of 33 
more items. 

7. Since the validity of 57 of the 240 items 
is questionable, caution should be used in the 
interpretation of the interest pattern obtained 
through use of the Inventory. 
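
The item counts in conclusions 4 to 7 can be checked with a few lines of arithmetic. The following is only a sketch: the variable and function names are illustrative and do not come from the study, and the numbers are those reported in the text (107, 76, 24 and 33 items out of 240; 11 items leaving and 3 entering the Mechanical Field of 40 items).

```python
# Item counts as reported in the text for Part I of the Inventory.
TOTAL_ITEMS = 240
complete_accord = 107                 # raters agree fully with the authors
substantial_agreement = 76            # dissent score of 12 or less
misplaced_with_rater_agreement = 24   # dissent score of 26 or more, raters agree on another field
no_agreement = 33                     # raters could not agree on any field


def tally():
    """Reproduce the totals quoted in the discussion and summary."""
    satisfactory = complete_accord + substantial_agreement
    questionable = misplaced_with_rater_agreement + no_agreement
    return {
        "satisfactory": satisfactory,
        "satisfactory_pct": round(100 * satisfactory / TOTAL_ITEMS),
        "questionable": questionable,
        "accounted_for": satisfactory + questionable,
    }


def mechanical_items(assigned=40, moved_out=11, moved_in=3):
    """Items left in the Mechanical Field after the raters' revisions."""
    return assigned - moved_out + moved_in
```

Under these figures the four groups of items add up to the full 240, which is a useful consistency check on the reported counts.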


Received December 10, 1952. 


References 


1. Brown, M. N. Expressed and inventoried interests
of veterans. J. appl. Psychol., 1951, 35, 401-402.

2. Brown, M. N. Evaluation of Lee-Thorpe inventory
ratings by veteran patients. Educ. psychol.
Measmt., 1951, 11, 248-254.

3. Buros, O. K. Third mental measurements year- 
book. New Brunswick, Rutgers University 
Press, 1949. 

4. Lee, E. A. and Thorpe, L. P. Manual of direc- 
tions, Occupational Interest Inventory, ad- 
vanced series. Los Angeles, California Test 
Bureau. Copyright 1943 (but containing ref- 
erences to test changes made in 1946). 

5. McPhail, A. H. Interest patterns for certain oc- 
cupational groups: Occupational Interest Inven- 
tory (Lee-Thorpe). Educ. psychol. Measmt., 
1952, 12, 79-89. 

6. Super, D. E. Appraising vocational fitness. New 
York: Harpers, 1949. 


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 5, 1953 


Scalability and Validity of the Socio-economic Status Items of 
the Purdue Opinion Panel 


H. H. Remmers and R. Bruce Kirk 


Purdue University 


In each of the polls of the Purdue Opinion 
Panel a brief scale is included to measure the 
socio-economic status of the respondents in 
a nationally representative sample of high 
school youth numbering from 8,000 to 18,000. 
The purposes, scope and details of the operation
of the Panel have been described elsewhere
(1, 2, 5, 6). The items in this brief scale
were originally taken from The American
Home Scale as the items with highest validity
(4). They have been slightly revised on occasion
because of obsolescence of an item.
For example, an item asking whether any
member of the family had been on relief was
valid in 1940, but obviously not in 1953. The
items used in the present study are as follows:

House and Home: Answer these questions by
checking “yes” or “no” in the space below.

“Does your family have:

A. a vacuum cleaner?                    Yes...  No...
B. an electric or gas refrigerator?     Yes...  No...
C. a bathtub or a shower with
   running water?                       Yes...  No...
D. a telephone?                         Yes...  No...
E. an automobile?                       Yes...  No...
F. Have you had paid lessons in
   dancing, dramatics, expression,
   elocution, art, or music outside
   of school?                           Yes...  No...”


The Problem 


The problems of the present study were:
(1) testing the unidimensionality of these
items by means of the Guttman test of
scalability (3); and (2) testing the validity
of the scale.


Procedure 


Two independent random samples of 100
respondents’ records were drawn from a total
available sample of about 10,000 by taking
every nth individual record. Each sample of
100 was then tested by means of the Guttman
technique (3), once by restricting cut-off
points to score boundaries, and again by the
“ideal” method.

All pupils in the two samples reported at
least one of the items. Possession of an
automobile did not, however, distinguish anywhere
along the line, thus showing it to be a
non-discriminating item. Quite possibly the make,
year and model of the automobile might be
found to be relevant to such an index of
socio-economic status, but this information was not
in hand. The other items scale satisfactorily
by Guttman’s criterion of 90 per cent
reproducibility.
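
The 90 per cent criterion can be illustrated with a small computation. What follows is a minimal sketch and not the Cornell technique as applied in the study: it assumes the simplified error count in which each respondent's answers are compared with the ideal scale pattern for his total score, and the function name and data layout are hypothetical.

```python
def reproducibility(responses):
    """Coefficient of reproducibility for yes/no (1/0) item responses.

    responses: list of equal-length lists, one per respondent.
    Errors are counted as deviations from the ideal pattern implied by
    each respondent's total score (an assumed, simplified counting rule).
    """
    n_items = len(responses[0])
    # Order items from most to least frequently endorsed.
    totals = [sum(r[j] for r in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    errors = 0
    for r in responses:
        score = sum(r)
        # Ideal pattern: "yes" on the `score` most popular items only.
        ideal = [1 if rank < score else 0 for rank in range(n_items)]
        observed = [r[j] for j in order]
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1 - errors / (len(responses) * n_items)


# Hypothetical responses, for illustration only.
perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
one_deviant = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 1]]
rep_perfect = reproducibility(perfect)
rep_deviant = reproducibility(one_deviant)
```

A set of items would pass the criterion quoted above when the returned value is at least 0.90.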


Validity of the Scale 


Validity may be variously defined but perhaps
its most acceptable meaning is in terms
of the prediction of a criterion. It is
essentially in this sense that we use the term here.
The data are taken from Purdue Opinion
Panel Poll Reports in the form of statistically
reliable differences between the low and the
high status groups as defined by the scale.
All differences reported here are at the 1
per cent level of confidence or better. The
critical ratios range from 2.6 to 9.5. The
stratified-random sample is usually between
2,000 and 3,000 respondents with from 20 to
25 per cent in the high socio-economic status
group and from 75 to 80 per cent in the low
group.

In Poll No. 21 in 1949 the students in the
survey sample were asked to check their
individual problems in a list of 300 such [...]
items.1 The following items in Tables 1 and 2
summarize the breakdown of responses which
yielded reliable differences, i.e., reliable
correlations between socio-economic status as
measured and the incidence of problems
checked by the low and the high socio-economic
status groups. The items are the 39
that yielded such differences.

1 Results of [...] SRA Youth [...] Manual give
item-test correlations, norms, etc.


Table 1

Percentage Differences between Low and High Socio-economic Status
Groups on Problems that Predominate in the Lower Group

Note: all differences reliable at the 1% level of confidence or better.

                                                      Socio-economic
                                                          Status
Item                                                    Low     High
No.                                                   N=1809   N=646   Difference
 19. I would like to have more vocational courses.      31       25         6
 38. What shall I do after high school?                 51       37        14
 42. Should I go to college?                            36       26        10
 46. I can’t afford college.                            24        9        15
 51. I must select a vocation that doesn’t require
     college.                                           16        6        10
 62. What jobs are open to high school graduates?       45       25        20
 63. How do I go about finding a job?                   37       30         7
 95. I feel that I’m not as smart as other people.      36       27         9
101. I get stage fright when I speak before a group.    55       45        10
123. I wish I could carry on a pleasant conversation.   36       28         8
131. I want to learn to dance.                          35       25        10
140. There aren’t enough places for wholesome
     recreation where I live.                           46       34        12
155. I can’t find a part-time job to earn spending
     money.                                             30       20        10
157. I have no quiet place where I can study at
     home.                                              16       10         6
158. I can’t get along with my brothers and sisters.    18       12         6
182. I wish I had my own room.                          20       12         8
256. My teeth need attention.                           19        7        12
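
The critical ratios mentioned above can be illustrated from the percentages and group sizes in Table 1. This is a sketch under an assumption: it uses the ordinary pooled standard error of a difference between two independent proportions, which may not be the exact formula applied to the stratified sample in the Poll Reports. The function name is hypothetical.

```python
from math import sqrt


def critical_ratio(pct_low, n_low, pct_high, n_high):
    """Difference between two percentages divided by its standard error.

    Assumes simple random sampling and a pooled proportion; the original
    reports may have used a different error formula.
    """
    p1, p2 = pct_low / 100, pct_high / 100
    pooled = (p1 * n_low + p2 * n_high) / (n_low + n_high)
    se = sqrt(pooled * (1 - pooled) * (1 / n_low + 1 / n_high))
    return (p1 - p2) / se


# Largest and smallest differences in Table 1 (items 62 and 19).
cr_item_62 = critical_ratio(45, 1809, 25, 646)
cr_item_19 = critical_ratio(31, 1809, 25, 646)
```

Under this assumed formula both ratios exceed 2.58, the value usually taken to mark the 1 per cent level, which agrees with the note to the table.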

Beyond demonstrating validity of the status
scale, the results summarized in Table 1 also
give something of a qualitative picture of the
kinds of preoccupations, worries and problems
which characterize the low status group.
By way of contrast it is of interest to examine
the items that yield significantly higher
proportions of responses from the high status
group. They are shown in Table 2.

It is of interest to note that, if we use the
averages of Low and High groups in Tables 1
and 2 as indices of the amount of worrying
among teen-agers in the country as a whole,
it is evident that the two groups worry about
the same amount. They differ sharply, however,
in the kinds of worries that they have.

Summary and Conclusions 


The Guttman test of scalability was ap- 
plied to two independent random samples 
drawn from a total sample of approximately 
10,000 high school pupils’ responses. Validity 
of the scale was investigated in terms of sig- 
nificant differences between items in the SRA 
Youth Inventory and the socio-economic 
status index used in the Purdue Opinion 
Panel. The data support the following con- 
clusions. 

1. The items of the Socio-economic Status 
Index are scalable and represent substantially 
a unidimensional scale. 

2. The scale is valid in that it correlates 
significantly with individual problems re- 
ported by a national sample of high school 
pupils. 

3. The two groups into which the respondents
are divided have about the same amount
of worries, but these are qualitatively very
different for the two groups.


Table 2

Percentage Differences between Low and High Socio-economic Status
Groups on Problems that Predominate in the Upper Group

Note: all differences reliable at the 1% level of confidence or better.

                                                      Socio-economic
                                                          Status
Item                                                    Low     High
No.                                                   N=1809   N=646   Difference
  9. I would like to take courses that are not
     offered in my school.                              31       37         6
 11. I have too much homework.                          19       25         6
 39. For what work am I best suited?                    54       60         6
 40. How much ability do I actually have?               58       64         6
 41. I want to know more about what people do
     in college.                                        35       42         7
 44. How shall I select a college?                      34       49        15
126. I want to make new friends.                        49       55         6
132. I have a desire to feel important to society or
     to my own group.                                   19       27         8
152. I’d like to know how to become a leader in
     my group.                                          21       27         6
154. I have difficulty budgeting my time.               19       28         9
169. I want to be accepted as a responsible person
     by my parents.                                     18       24         6
254. I don’t get enough sleep.                          14       20         6
277. How can I help get rid of intolerance?             12       23        11
278. How can I help to make the world a better
     place in which to live?                            27       39        12
279. What can I do about the injustice all around
     us?                                                13       23        10
281. I’m worried about the next war.                    29       37         8
282. Is there something I can do about race
     prejudice?                                         21       39        18
283. Is there any way of eliminating slums?             21       32        11
284. What can I do to help get better government?       13       20         7
285. How can I learn to use my leisure time wisely?     23       30         7
288. What can I contribute to civilization?             10       17         7
298. I wonder about the after life.                     20       29         9


Received December 19, 1952. 


References 


1. Gage, N. L. Scaling and factorial design in
opinion poll analysis. Studies in Higher Education
LXI; Further Studies in Attitudes, Series X.
Division of Educational Reference, Purdue
University, December, 1948.

2. Gage, N. L. and Remmers, H. H. Opinion polling
with mark-sensed punch cards. J. appl. Psychol.,
1948, 32, 88-91.

3. Guttman, L. The Cornell technique for scale
and intensity analysis. Educ. psychol. Measmt.,
1947, 7, 247-279.

4. [...] (Originally published by Science Research
Associates, Chicago [...] and to be republished by
Psychometric Affiliates, [...] Institute of
Technology; Chicago, Illinois.)

5. Remmers, H. H. Measuring the public opinion
of tomorrow. Indiana Teacher, May 194[...], 281 [...].

6. Remmers, H. H. The Purdue Opinion Poll for
young people. Scientific mon., 1945, 60, 292-300.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953 


The Relation of Motivation and Skill to Active and Passive 
Participation in the Group * 


Ben Willerman 


Student Counseling Bureau, Office of the Dean of Students, University of Minnesota 


In any group there is variation in the extent 
of participation by different members in the 
activities of the group with some members 
more active than others. Among other fac- 
tors, it is likely that differences in the per- 
sonal characteristics of the members will help 
to explain the variation in extent of par- 
ticipation. 

Many voluntary organizations have both a
social and an organizational aspect with two
corresponding areas in which group members
may participate. Participation in the more
purely social activities of the group may be
quite different from working to recruit
members, planning and arranging meetings, etc.
This study is concerned with the latter type of
participation.

The purpose of this study is to explore two
general hypotheses concerning the differences
between active members (AM) and passive
members (PM):

1. AM as compared with PM are more
motivated to participate in organizational
activities, become more involved in the
organization and derive more satisfaction from the
organization. These differences are due, in
part, to the possession to a greater degree by
the AM group of personality characteristics
which dispose them to become interested in
and participate in groups.

2. AM more than PM have abilities which
are probably related to effective action in
organizations. In this study, it is assumed that
the skills required to perform organizational
functions are largely verbal and are adequately
measured by an academic aptitude test.

* This report is one of a series of research studies
in student life being conducted by the Office of the
Dean of Students, University of Minnesota. Various
staff members gave helpful advice and assistance with
this study. The author is grateful to Jack Laugon,
St. Olaf College, formerly with the Student Activities
Bureau, Office of the Dean of Students, University
of Minnesota, for his assistance in a pilot phase
of this investigation.
This study was supported in large part by a research
grant from the Graduate School of the University
of Minnesota.

Method 


A questionnaire was given to 19 of the 20 
academic sororities on the campus of the Uni- 
versity of Minnesota. These questionnaires 
were completed by approximately 90% of the 
entire sorority membership. Two questions 
were used to select the subjects. The ques- 
tion used to select the AM of the sorority was, 
“List the names of the members of your so- 
rority who would be a real loss to the sorority 
if they became inactive.” The question used 
to select the PM was, “List the names of the 
members of your sorority who do not seem to 
have much interest in the sorority.” To be 
included in the sample as an AM, a girl had 
to be selected by at least one-third of the re- 
sponding members of her sorority. The cor- 
responding criterion for PM was 10% or 
more. 

In order to control two variables which were 
correlates of active and passive participation 
but not relevant to this study, some members 
of the sample were eliminated: (1) Girls liv- 
ing in the house were excluded because they 
were more frequent in the AM group and not 
frequent enough in the PM group to warrant 
separate analysis. The sample, therefore, con- 
sists entirely of town girls who generally com- 
mute from their homes to the University. 
(2) Girls who were members less than one 
year were excluded to eliminate the effects of 
a short membership period. This procedure 
left 41 AM and 37 PM. Almost all of the AM 
held an important sorority office. Few of the 
PM held an office. This outcome suggests 
that our dimension of active and passive mem- 
bership is similar to but not identical with 
the leadership-followership dimension. The
followership classification does not exclude
active members who are not leaders nor does
the leadership classification necessarily
include active members who contribute to the
group but not as leaders. Bird (1) and
Stogdill (5) have excellent summaries of
differences between leaders and non-leaders.

The questionnaire contained items related 
to the member’s satisfactions and dissatisfac- 
tions with her sorority. Five-point rating 
scales elicited self-estimates of importance to 
the group, participation in the group, feelings 
of group belongingness, satisfaction with and 
acceptance of group decisions. Friendship 
choices and extra-sorority activities, as well 
as some background and demographical data 
were also obtained. 

For many of the girls, test scores were on 
file at the Student Counseling Bureau. The 
tests included the Minnesota Multiphasic Per- 
sonality Inventory (MMPI), the ACE Psy- 
chological Examination, and high school rank 
(HSR). 

Chi² is the statistical test of significance
used most frequently here and when used the
distribution is split at the median.
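
A median-split Chi² test of this kind can be sketched as follows. The example is illustrative only: the scores are hypothetical, the function names are not taken from the study, and it assumes the uncorrected Pearson formula for a 2 x 2 table with the split made at the median of the two groups combined.

```python
from statistics import median


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def median_split_chi_square(group_1, group_2):
    """Split both groups at the pooled median and test the resulting 2 x 2 table."""
    cut = median(group_1 + group_2)
    above_1 = sum(x > cut for x in group_1)
    above_2 = sum(x > cut for x in group_2)
    return chi_square_2x2(above_1, len(group_1) - above_1,
                          above_2, len(group_2) - above_2)


# Hypothetical scores, for illustration only.
am_scores = [3, 4, 5, 6, 7, 8]
pm_scores = [1, 2, 2, 3, 4, 5]
chi2 = median_split_chi_square(am_scores, pm_scores)
```

The resulting value is compared with 3.84, the 5% point of Chi² with one degree of freedom.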


Results 


The AM derived more satisfaction, in general,
from their membership than the PM.
The first source of evidence for this result as
well as for the “validation” of the selection
method comes from five self-rating scales.
Table 1 shows that the AM more than the
PM believe the group regards them as
important, and consider themselves as
participating in the sororities’ activities more than
most members. A larger proportion of AM
express feelings of group belongingness, are
satisfied with, and agree most of the time with
the group’s decisions.


Table 1

Self-Ratings of Participation, Perceived Importance to
Group, Group Belongingness, Agreement with
Actions of Group, and Acceptance of Group
Decisions when in Disagreement*

                                     Active     Passive
                                     Members    Members
                                     (N=41)     (N=37)
                                        %          %
Participate More or Much More
  in Sorority Activities than
  Most Members                         93          5
Important and Very Important           71         14
Real Part of Sorority                  98         62
Agree Most of Time with Actions
  of Group                             73         30
Complete Acceptance of Group
  Decisions                            49         11

* All of these differences are significant well beyond
the 5% level by Chi².

The second type of evidence relates more
to the sources of satisfaction and dissatisfaction
than to the amount of satisfaction. The
free answers to the questions, “What do you
like most . . . ?” and “What do you least like
about your sorority?” were coded using
categories defined largely in terms of the actual
answers. Coding agreement between two
coders based upon 64 questionnaires for the
“like most” answers was almost 75% and for
the “like least” answers, based upon 34 cases,
was about 80%. Since coding in some
individual categories was very unreliable, and to
conserve space, no table of these results is
presented and only outstanding differences
will be discussed.

Two main differences occur. The AM girls
like the “group spirit, cooperation, and
unity” of their sorority more than the PM
(37% vs. 5%, Chi², P < .05). The AM also
more frequently like least the “lack of interest
or cooperation of some members” (51%
vs. 5%, P < .05). The PM more frequently
like least the “compulsory functions” (16%
vs. 0%, Fisher’s exact test for 2 X 2 tables,
P < .05). Although the difference is
significant at only the 10% level, the PM more
frequently complain that the sorority takes
too much of their time. We may infer from
these differences that the AM derive their
satisfactions and dissatisfactions from the
attainments or frustrations of the organizational
goals, while the PM are less oriented this way
and seem to regard some features of the
sorority as an interference with their personal
life.

One specific hypothesis concerning the low
interest of the PM in the sorority organization
was suggested by the staff members of
the Student Counseling Bureau on the basis
of their experience with the MMPI: girls with
an intense interest in male companionship
tended to have a relatively high psychopathic
deviate (Pd) score and a relatively low
masculinity-femininity (Mf) score (low on this
scale means more feminine). It was reasoned
that this type of attraction to persons outside
of the sorority would be associated with lowered
interest in the sorority. On this hypothesis
the discrepancy scores between the Pd and
Mf scales were compared for the two groups
with the expectation that the discrepancy
scores for the PM would be greater than those
for the AM. The results together with some
other test scores are shown in Table 2. While
the difference between the two groups is not
significant at the conventional 5% level, the
t test (two-tailed) reaches approximately the 9%
level. There is, then, the possibility that one
of the causes of low participation in the
sorority group is the strong interest in men.
Another interpretation is that when
irresponsibility or non-conformity (high Pd) is coupled
with a low interest in organizational functions
(with which femininity may be correlated),
the result is a girl neither adept at nor
interested in the tasks of maintaining an
organization.

If active participation in a group is the
result of more than just situational
circumstances, we should expect AM to be more
active in other organizations. While we do not
have a direct measure of amount of time or
effort devoted to other organizations we do
have evidence concerning the number of
memberships held. This measure, which is
probably related to active participation, shows
that the AM belong to more extra-sorority
groups (Chi², P < .01).

The greater participation of the AM in
other organizations does not seem to be limited
to the formal aspects of participation. In
response to the question asking who their best
friends at the University were, the AM
mentioned 2.00 and the PM mentioned 1.09 girls
on the average who were not members of the
sorority (Chi², P < .05).

Logically, passive participation may result 
from either low motivation in the direction of 
participation or, if motivation is present, from 
counter-tendencies which oppose participa- 
tion. The latter may be labelled “restraints” 
against participating. A plausible type of 
restraint in social situations is “fear of failure” 
or lack of self-confidence. To test the possi- 
bility that the PM possess characteristics 
which block participation, the K scores of the 
MMPI were compared for the two groups. 
The justification for the use of K as a measure 
of self-confidence rests upon an inspection of 
the items, many of which have “face validity,” 
and upon the opinion of some counselors who 
believe that K often reflects genuine self-con- 
fidence rather than defensiveness. The re- 
sults are consistent with the above reasoning. 
The mean K score for the PM is significantly 
lower than for the AM. 

Another condition which may prevent an
individual from contributing to a group is
lack of skills demanded by organizational
tasks. In a sorority the organizational tasks
seem to require a relatively high level of
abstract ability and skill in communicating. A
girl low in verbal ability might either be
discouraged from participating by her fellow
members or impose restraints upon herself
because of fear of failure. Another possibility
is that the necessity for maintaining a minimum
grade average would leave a girl of low
academic ability little time for organizational
activities.


Table 2

Scores on HSR, ACE, and MMPI* for Active and Passive Members

             Active Members         Passive Members
Measure     N     Mean†    SD      N     Mean†    SD       t       P
HSR        37     83.9    16.2    36     64.6    25.9    3.97‡   <.01
ACE        37     64.9    25.6    35     44.7    25.6    3.34    <.01
K          31     60.3     7.9    26     53.4     7.7    3.34    <.01
Pd         31     52.8     8.5    26     55.8     7.9    1.36    >.10
Mf         31     48.8     7.0    26     46.2     9.9    1.14    >.10
Pd-Mf      31      4.0    11.0    26      9.6    13.0    1.75    <.10 >.05
Sie       [...]§  [...]   [...]   16§    55.0    10.2    3.85    <.01

* Data of other scales of MMPI not reported because they were
not regarded as relevant to this study.
† Mean percentile scores for HSR and ACE; mean T scores for
MMPI scales.
‡ Critical ratio was used because of unequal variances.
§ The N’s for this scale are fewer than for other scales because
some subjects’ tests were scored before the Sie scale was
constructed.

Although we do not have sufficient evidence 
to choose among these alternative hypotheses, 
a comparison of the mean scores of the ACE 
(total) indicates that skill as a determining 
factor is an effective variable. Table 2 shows 
that a difference of 20 points in ACE exists 
between the two groups. In addition, the 
HSR, a measure of past academic achieve- 
ment, gives similar results. 
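
The t values in Table 2 can be approximated from the reported Ns, means and standard deviations. The sketch below assumes an ordinary pooled-variance t test for two independent groups and treats the reported SDs as sample standard deviations; the study does not state the exact formula, so small discrepancies are to be expected. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t for two independent groups from summary statistics.

    Assumes equal population variances and SDs computed with n - 1.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se


# ACE and K rows of Table 2 (active members first, passive members second).
t_ace = pooled_t(37, 64.9, 25.6, 35, 44.7, 25.6)
t_k = pooled_t(31, 60.3, 7.9, 26, 53.4, 7.7)
```

Both results fall close to the tabled value of 3.34. The HSR row is not included because a critical ratio, rather than t, was used there on account of unequal variances.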

Since many of the girls in the PM group
had high ACE scores and many had low Pd-Mf
discrepancy scores, the question was raised
as to why these girls were passive participants.
The answer to this question may be found in
the correlation between these two sets of
scores, which for the PM is r = .45 (N = 25,
P < .05). Thus, the more a PM has an
indicator related to active participation (high
ACE), the more she tends to have an indicator
which is related to passive participation
(high Pd-Mf discrepancy).

For the AM group the variance of ACE
scores is lower and moreover no such
relationship was predicted. The corresponding
r of -.22 is not significant.
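
The significance of a correlation of this size can be checked with the usual conversion of r to t. This is a sketch, assuming the PM correlation is .45 with N = 25 and using the standard two-tailed 5% point of t for 23 degrees of freedom (2.069); the names are illustrative.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic for testing a product-moment correlation against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Two-tailed 5% critical value of t with 23 degrees of freedom.
T_CRIT_05_DF23 = 2.069

t_pm = t_for_r(0.45, 25)
significant_pm = t_pm > T_CRIT_05_DF23
```

The same function could be applied to the AM correlation once the number of cases on which it is based is known.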

The large and statistically significant difference
between the two groups on the Social
Introversion-Extroversion scale of the MMPI
is perhaps further validation for the scale
(2, 3, 4) and reinforces the hypothesis of the
existence of personal factors producing
differences in participation.


Summary and Conclusions 
The purpose of this study was to test the
hypotheses that active and passive participation
in the organizational functions of a group
was related to motivation to belong to the
particular group, to general tendencies to be
oriented toward participating in groups, and
to skill in performing the tasks required by the
organization.

The nomination technique was used to
select 41 girls who were active members and
37 girls who were passive members from a
total of 19 college sororities. Self-rating
scales, sociometrics and test scores provided
the basic data.

1. The active members were more attracted
to their groups and seemed to derive
satisfactions and dissatisfactions more from the
organizational features of the sorority than the
passive members.

2. The active members belonged to more
student organizations and had more friends
outside the sorority, indicating a general
tendency to be attracted to organizations and
to be socially inclined more than the passive
members.

3. Using the discrepancy score of the Pd
minus Mf scales of the MMPI as an indicator
of strong interest in men which detracts from
affiliation with the sorority, the passive
members turned out to have higher scores.
However, an alternative hypothesis that this
discrepancy score represents a combination of
non-conformity (Pd) and absence of interest
in organizations (Mf) is also plausible.

4. The lower K scores of the MMPI for
the passive members are interpreted as the
presence of lack of confidence which operates
as a “restraint” against participating.

5. As measured by the Sie scale of the
MMPI, the passive members are definitely
more introverted, indicating a general
tendency for personal factors determining
participation in a particular group.

6. The active members were 20 percentile
points higher than the passive members on both
the ACE and HSR. As indirect measures of
aptitude and ability for organizational tasks these
differences lend support to the hypothesis that
skill is an important factor in participation.


Received December 3, 1952. 


References

1. Bird, C. Social psychology. New York:
Appleton-Century, 1940.

2. Drake, L. E. A social I. E. scale for the MMPI.
J. appl. Psychol., 1946, 30, 51-54.

3. Drake, L. E. and Thiede, W. B. Further validation
of the social I. E. scale for the MMPI.
J. educ. Res., 1948, 41, 551-556.

4. Gough, H. A research note on the MMPI social
I. E. scale. J. educ. Res., 1949, 43, 138-141.

5. Stogdill, R. M. Personal factors associated with
leadership: A survey of the literature. J.
Psychol., 1948, 25, 35-71.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953 


Studies in Social Interaction: III. Effect of Variation in One 
Partner’s Prestige on the Interaction of Observer Pairs * 


Bernard Mausner 


University of Massachusetts 


It is a basic assumption of contemporary 
advertising and politics that people’s judg- 
ments can be swayed by the opinions of in- 
dividuals with high prestige. Clear-cut proof 
or disproof of this assumption in field situa- 
tions is difficult. For example, an attempt to 
test the effects of endorsements by sports 
heroes on the sale of cigarettes would prob- 
ably be beset by many problems. It may be 
of interest therefore to test a parallel assump- 
tion in the laboratory. The effect of one in- 
dividual’s judgments on those of another has 
been studied extensively through the use of 
an experimental design first employed by 
Sherif (9). In this, judgments are made first
by each S alone, then in a group situation.
Some of this work has indicated that the de-
gree to which any one S will abandon his own
judgment range for that of a partner will de-
pend, in part, on the characteristics of that
partner (3, 4, 5).

The present study was designed to test the
effect of variation in a partner’s prestige on
social interaction in such an experimental de-
sign. Art judgments were chosen because
these can be demonstrated to be stable in the
absence of social stimulation, and because the
manipulation of prestige in the area of art is
relatively simple. The following experimental
hypothesis was tested: Ss will be influenced
more by the judgments of a partner with high
prestige than by those of one with low
prestige.

Method 


A group of forty undergraduate students in Washington Square College
was given the Allport-Vernon Scale of Values. From these were chosen
three groups of ten Ss matched for their percentile scores on the scale
of aesthetic value (1). All of these Ss were then given the Meier Art
Judgment Test individually. Except for the omission of information
concerning differences between the paired plates, standard instructions
were followed. In a second session one week later the Ss were told that
they were to repeat the test in order to determine its reliability. Ss
in Group I (control) repeated the test substantially as in the first
situation. Ss in Groups II and III were told that, to save time, the
test would be given to pairs. A confederate of the experimenter was the
second member of each pair. He was introduced to Ss in Group II as a
fellow-student, to Ss in Group III as the art director of a local
advertising agency interested in the results of the test.

* Formerly Washington Square College, New York University. The writer
wishes to acknowledge the assistance of [...] Celentano in the conduct
of [...]. Mr. Ted Climis played the roles of “fellow student” and “art
director.”

In the together situation both members of 
the pair judged each of the pairs of pictures; 
the S was first in all even trials, the confeder- 
ate in all odd trials. The confederate had 
memorized the test; he consistently stated the 
preference indicated as “wrong” by the scor- 
ing key. 

Results 


The degree of social influence is expressed in terms of the change in
the number of wrong preferences from the first to the second session.
Table 1 gives mean wrong judgments for both sessions for each group with
the odd and even trials treated separately. Also included are t's for
the differences between mean wrong judgments alone and with a partner.¹


Table 1

Mean Wrong Choices on the Meier Art Judgment Test
M1 Gives Results for Pretest; M2: Alone for Group I, with “Fellow
Student” for Group II, and with “Art Authority” for Group III

Note: S first on even trials, partner first on odd.

        Group I              Group II             Group III
        Odd       Even       Odd       Even       Odd       Even
M1      14.6      13.6       13.3      13.7       10.9      16.1
M2      14.8      13.9       17.0      14.5       19.6      18.2
t       [illeg.]  [illeg.]   3.9       .53        13.0      1.3
p       >.8       >.8        <.01      [illeg.]   <.001     >.2
N       10        10         10        10         10        10
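The influence measure described above, the change in wrong choices counted separately for odd and even trials, can be sketched in a few lines. The scoring key and the responses below are hypothetical placeholders, not data from the study, and the function name is illustrative only. Trials are numbered from 1, so that the partner speaks first on odd trials.

```python
def wrong_counts(responses, key):
    """Count wrong choices separately for odd- and even-numbered trials (1-based)."""
    odd = even = 0
    for trial, (given, correct) in enumerate(zip(responses, key), start=1):
        if given != correct:
            if trial % 2:
                odd += 1
            else:
                even += 1
    return odd, even


# Hypothetical scoring key and responses for one S (not data from the study).
key = ["A", "B", "A", "A", "B", "B"]
alone = ["A", "B", "B", "A", "B", "A"]    # first session, S alone
paired = ["B", "B", "B", "A", "A", "A"]   # second session, with a partner

odd_alone, even_alone = wrong_counts(alone, key)
odd_paired, even_paired = wrong_counts(paired, key)

# Shift in wrong choices from the first to the second session.
odd_shift = odd_paired - odd_alone
even_shift = even_paired - even_alone
```

In this invented example the shift appears only on the odd trials, which is the pattern the design is meant to detect when the partner's stated preference, rather than a general loss of discrimination, is at work.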

Group I showed no significant change in 
the frequency of wrong judgments from situa- 
tion one to situation two. This would be an- 
ticipated from the high reliabilities reported 
for this test: coefficients of reliability range
from .70 to .85 (7). Groups II and III
showed no significant change in the frequency 
of wrong judgments for the even trials where 
S made his choice first. It can be assumed 
therefore that there was no tendency for as- 
sociation with a partner of such deplorable 
taste to affect the general ability of S to dis- 
criminate good from bad pictures. However, 
in the odd trials, where the partner made his 
choice first, both Groups II and III demon- 
strated an increase in the frequency of wrong 
judgments (cf. Table 1). The differences are 
significant at the .01 level for Group II, at far 
better than the .001 level for Group III. 

An examination of the data for individual 
Ss reveals that in Group II two of the ten Ss 
showed fewer wrong responses in the second 
situation than in the first. All of the Ss in 
Group III gave more wrong responses in the 
social than in the individual situations. In- 
terpretation of differences between Groups II 
and III is justified to the extent that they are 
drawn from a relatively homogeneous popula- 
tion, and were matched for aesthetic values. 
The mean increase in wrong responses is significantly greater in Group
III than in Group II: M_DII = -3.8, M_DIII = -7.2, t = 1.70, n = 9; p
approaches .05 when the difference is evaluated in terms of a one-tailed
hypothesis. This hypothesis is considered valid because the experimental
hypothesis presented did not involve the probability of decrease in
wrong responses, and because it was predicted that Group III would show
more shift than Group II. Certainly the difference between a t of 3.9
for Group II and 13.0 for Group III would indicate that the changes
should be more reliable for the latter even though there is no way of
comparing the two t values quantitatively. These results demonstrate
that the judgments of a partner affected the responses of Ss taking the
Meier Art Judgment Test, and that Ss tended to converge more
consistently towards a partner with high prestige (art director, Group
III) than towards one with little prestige (fellow student, Group II).

¹ For detailed table showing frequency of wrong judgments by each S in
each situation order Document 3917 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress, Washington 25,
D. C., remitting $1.25 for microfilm (images 1 inch high on standard
35 mm. motion picture film) or $1.25 for photoprint readable without
optical aid.
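As a rough check on the one-tailed reading of the comparison between Groups II and III, the sketch below computes upper-tail probabilities of Student's t distribution by numerical integration. It assumes that the reported n = 9 is the number of degrees of freedom, which the text does not state explicitly. The function names are illustrative only.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Assumption: the reported n = 9 is treated as the degrees of freedom.
one_tailed = upper_tail(1.70, 9)
two_tailed = 2 * one_tailed
```

Under that assumption a t of 1.70 gives a one-tailed probability between .05 and .10, half the two-tailed value, which agrees with the statement that p approaches .05 only when a one-tailed hypothesis is adopted.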


Discussion 


The writer has suggested (6) that the group judgment situation may be
considered to create conflict for S between a tendency to continue
giving his prior judgments, and one to agree with his partner. The
findings of the present study indicate that the latter tendency may be
affected by the partner’s prestige. This does not deny the role of other
factors in determining the extent of group influence. The effect of
expert opinion on other kinds of judgment has been extensively
investigated with, unfortunately, no consistent results; in some work
“experts” were less effective in shifting judgment than “ordinary
people,” in others more effective (8, pp. 946-980). The differences
among these reports may be due on the one hand to unreported or
unanalyzed differences in other determinants of interaction: the
stimulus factor, the nature of the response, the personalities of the Ss
themselves. On the other hand, this variation in the reported results
may be due to an inadequate specification of the nature of prestige (2).
The teacher may be prestigeful in some areas, not in others. The
“expert” may exert maximum influence only when his “expertness” is
accepted by S as genuine. Prestige, thus, might have to be measured in
terms of influence on judgment.

One possible way of avoiding this circularity would be to attempt to
relate various independent determinants of prestige to degree of
convergence between members of a coacting pair of Ss. These could
include status in a hierarchy, group membership, past history of contact
between Ss. In the present study prestige is varied by means of
instructions to S regarding his partner’s group membership (expert vs.
non-expert). The results indicate at least that under relatively
controlled conditions it is possible to produce consistent variation in
the degree of social interaction as a function of prestige manipulated
in this manner. In further investigations attempts will be made to vary
the prestige factor more extensively through variation in other
determinants.

From a practical point of view, the present 
findings, with an obvious extrapolation, sup-
port the reliance on “expert testimonial”
which has long been accepted practice in busi-
ness and politics. However, the above dis-
cussion indicates that a caution is necessary.
It is not always possible as yet to predict 
when an externally labelled authority will ac- 
tually be accepted as an authority and will be 
able to exercise appreciable influence on judg- 
ment. 

Summary and Conclusions 


In a test of the effect of variation in one 
partner’s prestige on the interaction of ob- 
server pairs three groups of ten Ss, equated 
for interest in art by means of the Allport- 
Vernon Scale of Values, were given the Meier 
Art Judgment Test. Ss in Group I repeated 
the test alone; Ss in Group II and Group III 
repeated it with a partner. He was introduced
to Group II as a fellow student, to Group III
Ss as an “art authority.” The partner in all
cases made choices indicated as wrong by the
scoring key.

Degree of social influence was measured in 
terms of the shift in frequency of wrong judg- 
ments from the “alone” to the “social” situa- 
tion. Group I (control) showed no significant 
shift in mean number of wrong judgments. 


Both Groups II and III showed an increase 
significant for Group II at the .01 level, for 
Group III beyond the .001 level. Comparison 
of the two groups using a one-tailed test of 
significance shows Group III (art authority) 
giving a greater increase of wrong judgments 
than Group II (fellow student). This differ- 
ence approaches significance at the .05 level. 

It is concluded that the judgments of Ss 
taking the Meier Art Judgment Test were af- 
fected by the responses of coacting partners, 
and that this effect was a positive function of 
the partner’s prestige. 


Received November 21, 1952. 


References 


1. Allport, G. W., Vernon, P. E., and Lindzey, G. A study of values.
   Cambridge: Houghton Mifflin, 1951.

2. Asch, S. E. The doctrine of suggestion, prestige, and imitation in
   social psychology. Psychol. Rev., 1948, 55, 250-277.

3. Berenda, Ruth W. The influence of the group on the judgments of
   children. New York: King’s Crown Press, 1950.

4. Bovard, E. W., Jr. Group structure and perception. J. abnorm. soc.
   Psychol., 1951, 46, 398-405.

5. Bray, D. W. The prediction of behavior from two attitude scales.
   J. abnorm. soc. Psychol., 1950, 45, 64-84.

6. Mausner, B. Studies in social interaction: I. A conceptual scheme.
   J. soc. Psychol., in press.

7. Meier, N. C. The Meier art tests. I. Art judgment. Examiner’s manual.
   Iowa City: State University of Iowa, 1942.

8. Murphy, G., Murphy, L. B., and Newcomb, T. M. Experimental social
   psychology. (Rev. Ed.) New York: Harpers, 1937.

9. Sherif, M. A study of some social factors in perception. Arch.
   Psychol., 1935, No. 187.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953 


Methods of Conducting Critiques of Group Problem-Solving
Performance


E. Paul Torrance 


Human Resources Research Laboratories Detachment, Stead Air Force Base, Nevada


The purpose of this study is to evaluate the relative effectiveness of
four alternative methods for conducting brief critiques of a short
problem-solving exercise designed to assist groups (air crews) to
function more effectively as groups.

In many training situations, both military and civilian, it is necessary
to conduct brief on-the-spot critiques of a group’s performance.
Instructors of the Advanced Strategic Air Command Survival School, the
scene of the present study, are faced with this problem many times
during the course of the field training of each crew they instruct. In
all of these situations, there is the problem of how much guidance by
the instructor or expert produces the best results. Can a crew
effectively criticize itself and improve its problem-solving
performance, or is t[...]


Theoretical Considerations


Much has been written in the areas of counseling and guidance and
industrial training about techniques applied to the individual to bring
about proper evaluation and improved adjustment or performance. One set
of considerations deals with the locus of evaluation. One group, of
which Rogers is the chief spokesman, holds that only when the locus of
evaluation is in the individual does real growth and development take
place (20). According to this theory, an evaluation by an expert or an
evaluation resulting from a test would remove the locus of evaluation
from the individual and would not result in development and growth.
Essentially the same theory is represented in the work of Cantor (1, 2),
Maier (14), Lippitt (12), French (4), Katzell (10), Haas (5), and
others. If one were to apply this theory to the problem of critiques,
the superior method would be expected to be one in which the leader
assumes a non-evaluative role and stimulates the group to evaluate its
performance and discover improved methods.

A second set of considerations centers about the role of group decision
in changing behavior. Recent findings in industrial research and
nutritive education research (6, 8, 11) indicate that group discussion
as such results in very little change in behavior, while group decision
as a component of group discussion brings about considerable change. In
these experiments, scientifically developed information was given by the
expert as it was needed but the decision was left to the group. Haire
(6) points out, however, that group decision does not work with passive
or apathetic groups, although its use almost always stimulates a desire
for participation and eventually changes the apathy.

A number of experiments have explored situations and leadership
techniques which set up resistance or retard growth, and others which
win acceptance or stimulate growth. The problems of resistance have been
treated by Zander (22), Torrance (20) and Coch and French (3). All
emphasize the importance of respecting the individuals or groups
involved.

A variety of methods are discussed by Maier (14, 15, 16), Cantor (1, 2),
Haas (5), Haire (6), Lippitt (12), and Rogers (18). There seems to be
agreement that improved performance does not result merely through
reading or hearing lectures. More active participation methods, such as
through discussion and role playing procedures, are needed.

The skill of the leader must also be considered as a factor. A series of
experiments conducted by Maier (17, p. 170) showed “that a leader, if
skilled and possessing ideas, can conduct a discussion so as to obtain a
quality of problem-solving that surpasses that of a group working with a
less skilled leader and without creative ideas. Further, he can obtain a
higher degree of acceptance than a less skilled person.”

Maier concludes, however, that “even an 
unskilled leader can achieve good quality solu- 
tions and a high degree of acceptance” using 
democratic leadership. In another experi- 
ment (16), he demonstrated the superiority 
of the permissive discussion leader over the 
self-critique discussion with an observer pres- 
ent. Maier maintains that the major part of 
the difference was due to the relatively greater 
influence exerted by individuals with minority 
opinions in the “leader” groups than in the 
“observer” groups. “A discussion leader can 
function to up-grade the group’s thinking by 
permitting an individual with a minority 
opinion time for discussion” (16, p. 287). 


Method and Procedure 


Subjects. The subjects of the experiment were 
57 combat air crews undergoing training at the 
Strategic Air Command’s Advanced Survival 
School at Stead Air Force Base, Nevada. Most 
of these crews were B-29 (11 men) crews, but a 
few B-50 (10 men) and B-36 (usually about [illegible]
men) crews were also included. Most of the
crews had been functioning as crews for about 
four months, although some had been together 
for two or more years. 

Problem-Solving Exercises. Two of the Intellectual Talents Tests (401-B
and 701-X) developed by the Human Resources Research Laboratories were
used. Both tests are thought to tap common-sense judgment and are alike
in that each presents the examinee with problem-situations too complex
for solution by any step-by-step logical reasoning process and requires
the examinee to select the most essential or critical of the many
elements presented in the problem-situation. The problem-situations are
rather commonplace and can be solved on the basis of knowledge gained
from background experiences common in most persons’ lives. The
differences in the 401-B and [...] the 701-X consists of a large number
of shorter [...] an unlimited number of choices.

Experimental Procedures. The crews were tested in tents measuring 16
[illegible] on the first day of their training. Each crew was first
given an orientation regarding the nature and purpose of the test.
Following this, each member of the crew was asked to make a pre-test
estimate of his crew’s performance. The first problem-solving test was
then administered, after which a post-test estimate of crew performance
was obtained from each crew member.

Following this, a critique of the first problem- 
solving performance was conducted by one of the 
following methods: 

1. Unstructured non-authoritarian or crew-cen- 
tered critique: The crew was asked to evaluate 
and discuss its own performance. Discussion was 
centered on both the decision as to method and 
the way it was reached, as well as the way the 
decision was executed. The experimenter tried 
to stimulate discussion and encourage crew mem- 
bers to evaluate their performance, but the ex- 
perimenter did not evaluate their performance. 
The experimenter accepted questions but referred 
them back to the crew. The attitude of the 
experimenter was definitely non-authoritarian. 
Techniques used were similar to those described 
by Cantor (1, 2), Maier (14), and Rogers (19). 

2. Directive or expert critique: The experi- 
menter diagnosed the performance of the crew 
according to a set of 13 rating scales (listed 
later), pointed out ineffective procedures, and 
suggested ways of improvement. He stated that 
through research, certain characteristics have 
been found to differentiate between crews which 
operate effectively and those which do not. The 
analysis included both the way the group went 
about making its decision and what they decided, 
as well as how they worked together to carry out 
the decision. 

The experimenter took a very active role, as- 
suming the role of the “expert.” He tried, how- 
ever, to give his advice in the most tactful way 
possible. He, nonetheless, gave definite evalua- 
tions and advice. The experimenter accepted 
questions and answered them as an “expert.” 

3. No critique: The experimenter went ahead 
and administered the California F-Scale which 
required about 15 minutes, before administering 
the second problem-solving test. 

4. Self-critique: Time was allotted for a critique 
and the experimenter left the tent, returning after 
15 minutes. 

5. Structured non-authoritarian or crew-cen- 
tered critique: The experimenter used the set of 
rating scales as a guide in getting the crew to 
evaluate itself and discover more effective ways 
of performing. The locus of evaluation was still 
within the crew, however. 

Following the 15 minute critique period, the
second problem-solving test, the 701-X, was ad-
ministered. The rules were the same as for the
first problem except that the time limit was ten
minutes.

Observations and Ratings. After each of the two problem-solving tests,
the experimenters completed a set of five-point rating scales following
a set of descriptive scales on each of the following characteristics:
(1) Organization of manpower; (2) Selective use of personnel; (3)
Supervision; (4) Participation in decision-making; (5) Acceptance of
suggestions or criticisms; (6) [illegible] of available time; (7)
Checking work; (8) Leadership function; (9) Survey of the situation;
(10) Understanding instructions; (11) Group atmosphere; (12) Speed of
reaction to the problem situation; and (13) Officer-airmen relations.


Results 


A problem-solving score was computed for 
each crew on both of the problem-solving 
tests, using the scoring formulae already in 
use for these tests. A performance rating was 
also computed for each crew on both of the 
problem-solving situations by adding the 
thirteen ratings made by the examiner. In 
order to hold constant scores and ratings for 
the first problem-solving test and to determine 
if the variance in scores and ratings is due to 
the method of conducting the critique, analyses of co-variance were then
carried out both for ratings and for scores. Using the ratings, the
variance for critique methods was found to be statistically significant
at the one per cent level of confidence (F = 4.968). Using
problem-solving scores, however, the variance was not statistically
significant at less than the five per cent level of confidence
(F = 1.957). Because of the small number of crews critiqued by each
experimenter by each method, it was not possible to compute the
interaction of experimenter and critique method.


Crews participating in the unstructured non-authoritarian critique were
combined with those participating in the self-critique and crews
participating in the expert critique were combined with those
participating in the structured non-authoritarian critique in order to
study the effect of structure vs. non-structure in critiques. Analysis
of co-variance revealed that the variance due to structure is
significant at the five per cent level both for ratings (F = 5.664) and
for scores (F = 5.124). Analysis of co-variance also showed that the
variance due to different experimenters is not statistically significant
(F = 0.429) for ratings and for scores.

In order to study relative improvement in 
performance which might be attributable to 
differences in methods of conducting critiques, 
each crew was ranked in order from one to 
fifty-seven on each of the four variables (score 
on 401-B, score on 701-X, ratings on 401-B 
performance, ratings on 701-X performance). 
Crews were then divided equally into a most 
improvement category and a least improve- 
ment category on ratings and on scores. 
Table 1 shows the percentage falling into each 
category according to method of conducting 
the critique for both ratings and scores. 
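The ranking and splitting procedure just described can be sketched as follows. The text does not say how improvement in standing was derived from the two sets of ranks, how ties were broken, or how 57 crews were divided "equally", so the choices below (gain taken as first rank minus second rank, ties broken by original order, the upper half rounded down) are assumptions, and the scores are hypothetical.

```python
def ranks(values):
    """Rank 1 = highest value; ties are broken by original order (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def most_improved(first, second):
    """Flag the half of the crews whose standing rose most between the two tests."""
    r1, r2 = ranks(first), ranks(second)
    gain = [a - b for a, b in zip(r1, r2)]  # positive = moved up in standing
    by_gain = sorted(range(len(gain)), key=lambda i: -gain[i])
    top = set(by_gain[: len(gain) // 2])    # upper half, rounded down (an assumption)
    return [i in top for i in range(len(gain))]


# Hypothetical scores for six crews on the first and second tests.
first = [10, 20, 30, 40, 50, 60]
second = [60, 20, 35, 10, 55, 50]
flags = most_improved(first, second)
```

The percentage of flagged crews within each critique method would then give the kind of figures entered in Table 1.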

The t-test of significance of differences in percentage reveals the
superiority of the expert critique over the non-authoritarian critique
(significant at the .001 level of confidence), no critique (significant
at the .01 level), and the self-critique (significant at the .02 level).
The difference in percentages between the expert critique and the
structured non-authoritarian critique is not statistically significant.
The latter tends to be more frequently followed by improvement than are
the unstructured non-authoritarian critique (significant at the .01
level of confidence), no critique (significant at about the .10 level of
confidence), and the self-critique (not statistically significant).
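The text does not give the formula used for testing differences between percentages. One common form is the pooled normal approximation for two independent proportions, sketched below. The counts are assumptions reconstructed from rounded percentages and group sizes (10 of 11 crews, about 91 per cent, against 4 of 12 crews, about 33 per cent), and the function name is illustrative only.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled z statistic and two-sided p for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # normal-curve tail area
    return z, p_two_sided


# Assumed counts: 10 of 11 crews (about 91%) versus 4 of 12 crews (about 33%).
z, p = two_proportion_z(10, 11, 4, 12)
```

With these assumed counts z is about 2.8, which lies beyond the two-sided .01 point of the normal curve. With groups this small the normal approximation is rough, so the result is only an indication of the order of significance involved.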


The situation in regard to improvement of problem-solving scores is
about the same as for ratings, except that the superiority of the expert
critique is not as clear. The t-test of significance of the difference
in percentage shows that the expert and structured non-authoritarian
methods are superior to the unstructured non-authoritarian, the
self-critique and no critique at about the .02 level of confidence. The
unstructured non-authoritarian method and the self-critique appear to
have no superiority over no critique.


Table 1

Comparison of Effectiveness of Methods of Conducting Critiques

                                      Structured               Unstructured
                          Expert      Non-Authori-  Self-      Non-Authori-  No
                          Critique    tarian        Critique   tarian        Critique
                                      Critique                 Critique
Basis of Comparison       (11 crews)  ([illeg.])    ([illeg.]) ([illeg.])    (12 crews)

Percentage showing
“most improvement” in
standing on scores        73          [illeg.]      [illeg.]   33            36

Percentage showing
“most improvement” in
standing on ratings       91          64            50         9             33


Discussion 


The fact that the structured non-authori- 
tarian is superior to the unstructured non- 
authoritarian method and that the expert 
method is not superior to the structured non- 
authoritarian method would suggest that the 
locus of evaluation is not important in the 
type of critique studied in this experiment. 
Of course, it may be that even though the 
“expert” makes evaluations, the crew still 
makes its own evaluations and does not sur- 
render its evaluative function to the expert 
as readily as some might suppose. A close 
examination of crews subjected to the expert 
method and making little improvement m- 
dicates that some of the evaluations given by 
the “expert” were definitely rejected by the 
crew. The crucial thing may be the giving of 
evaluations that can be accepted rather than 
the giving or not giving of evaluations. 

The issue of group decisions does not become crucial in this experiment
since in every case the decision was left to the crew, although that
decision may have been made by one person, usually the aircraft
commander. In using the unstructured non-authoritarian method, however,
it was observed by almost all of the experimenters that a crew would
recognize and discuss improved solutions and even appear to give general
approval to these solutions. Yet, when the time came to decide how to
organize for the second problem, the aircraft commander would simply
say, “We'll do it the same way we did the other one.” This may explain
why this method is no more effective than no critique of any kind.

In regard to the overcoming of resistance, the less structured methods
are least effective. It must be mentioned, however, that [illegible] the
crews which made the most outstanding improvement were crews using the
self-critique. The difficulty is that not all crews are
able to look objectively at their performance 
and discover more effective ways of working 
together. Most crews seem to require enough 
structure or guidance to assure that their 
evaluations and considerations will be con- 
cerned with the salient elements. This does 
not in any way deny the importance of the 
participation and involvement of the group. 
It does, however, emphasize the importance of 
the “expert” and the nature of the role he 
must play in order to be effective where single 
trial, immediate performance is concerned. 
Although the variation due to experimenter 
differences was not significant, differences in 
the success of experimenters were observed. 
For example, 70 per cent of the crews cri- 
tiqued by two of the experimenters were in the 
“most improvement” category while only 25 
per cent of the crews of another experimenter 
were in this category (significantly different 
at about the 5 per cent level of confidence). 
The least well trained experimenter differed 
very little from the best trained experimenters. 
The results would appear to have important 
implications for training of many types, espe- 
cially training of the on-the-job variety in 
industry, education, and the military services. 
Although there are a number of questions 
which need to be subjected to further study, 
the results of this study seem to point the way 
to using structured critiques where decisions 
are still left to the group, where final evalua- 
tion is left to the group, but where the trainer 
can help guide the evaluative process. This 
study also suggests several directions for 
further research which are being pursued 
through a series of additional studies now 
under way. These studies are concerned with 
the role of the expert, the decision-making 
techniques of the group’s usual leader, spread 
of learning within the group, and transfer of 
learning to more different situations. 


Summary 


A total of 57 combat air crews undergoing 
survival training were divided randomly into 
four experimental groups and one control 
group. Each experimental group was admin-
istered a problem-solving test, critiqued ac-
cording to one of four methods, and then ad-
ministered a second problem-solving test. The
control group was given no critique between 
the two problem-solving tests. 

Crews obtained scores on both of the prob- 
lem-solving tests and ratings of manner of 
performance on both of the tests. 

Analysis of covariance indicates statistically significant variances in
ratings due to method of conducting critiques. Analysis of covariance
indicates statistically significant variance in both scores and ratings
due to structuring the critique but no statistically significant
variance due to experimenters.

Critiques conducted according to the more highly structured methods are
more frequently followed by “greater improvement” than are critiques
conducted according to the less highly structured methods. Crews
participating in the unstructured non-authoritarian and the
self-critique do not perform significantly better than crews receiving
no critique.


Received November 24, 1952.


References 


1. Cantor, N. The dynamics of learning. Buffalo, N. Y.: Foster and
   Stewart, 1946.

2. Cantor, N. Learning through discussion. Buffalo, N. Y.: Human
   Relations for Industry, 1951.

3. Coch, L., and French, J. R. P. Overcoming resistance to change. Human
   Relat., 1948, 1, 512-532.

4. French, J. R. P. Field experiments: changing group productivity. In
   J. G. Miller (Ed.), Experiments in social process. New York:
   McGraw-Hill, 1950.

5. Haas, R. B. Action counseling and process analysis; a psychodramatic
   approach. Psychodrama Monogr., 1948, No. 25.

6. Haire, M. Some problems of industrial training. J. soc. Issues, 1948,
   4, 41-47.

7. Hendry, C. E., Lippitt, R., and Zander, A. Reality practice as
   educational method. Psychodrama Monogr., 1947, No. 9.

8. Hendry, C. E. A decade of group work. New York: Association Press,
   1948.

9. Human Resources Research Laboratories. Intellectual Talents Tests
   701-X and 401-B. Washington, D. C.: HRRL, Bolling Air Force Base.

10. Katzell, R. A. Testing a training program in human relations.
    Personnel Psychol., 1948, 1, 319-329.

11. Lewin, K. Group decision and social change. In T. M. Newcomb and
    E. L. Hartley (Eds.), Readings in social psychology. New York: Henry
    Holt and Co., 1947.

12. Lippitt, R. An experimental study of the effect of democratic and
    authoritarian atmospheres. Univ. of Iowa Studies in Child Welfare,
    1940, 16, 43-195.

13. Lippitt, R. Training in community relations. New York: Harper, 1949.

14. Maier, N. R. F. Principles of human relations. New York: John Wiley,
    1952.

15. Maier, N. R. F., and Zerfoss, L. F. MRP: a technique for training
    large groups of supervisors and its potential use in social
    research. Human Relat., 1952, 5, 177-186.

16. Maier, N. R. F., and Solem, A. R. The contribution of a discussion
    leader to the quality of group thinking: the effective use of
    minority opinions. Human Relat., 1952, 5, 277-288.

17. Maier, N. R. F. The quality of group decisions as influenced by the
    discussion leader. Human Relat., 1950, 3, 155-174.

18. Rogers, C. R. Client-centered therapy. Boston: Houghton Mifflin Co.,
    1951.

19. Rogers, C. R. Divergent trends in methods of improving adjustment.
    Harvard Educ. Rev., 1948, 18, 209-219.

20. Torrance, [...] The phenomenon of resistance [...] J. abnorm. soc.
    Psychol., 1950, [...]

21. [illegible] in a te[...] Press, [...]

22. Zander, A. Resistance to change—its analysis and prevention.
    Advanc. Mgmt, 1950, 15, 9-11.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953


Logical Reasoning: With and Without Training * 


William J. Morgan and Antonia Bell Morgan 


Aptitude Associates, Merrifield, Virginia 


It is very strange indeed that psychologists 
have paid so little attention to problems of 
logical reasoning. During the last 50 years 
no systematic and comprehensive approach 
has been undertaken by them toward these 
problems. We made a careful search of the 
literature since 1927 and found 21 references 
to experimental studies of logical reasoning, 
and we were rather generous in our interpre- 
tation of what constitutes an experimental 
study. Generally speaking, therefore, psy- 
chologists have been contributing less than 
one experimental study per year on prob- 
lems of logical reasoning. But they con- 
tinue to speculate and philosophize, not quite 
so often as the philosophers themselves, on
the characteristics of this mental process. It
is difficult to understand why supposedly
hard-bitten, scientifically-minded psycholo-
gists have given so little attention to this prob-
lem. Perhaps in their failure to exploit the
findings of Störring (7, 8, 9) and Eidens (2),
they became discouraged. Psychologists seem
to be under the delusion that logical reason-
ing is confined to the syllogism, a view which
has long been abandoned by the logicians
themselves.

It may also be that psychologists have not
been willing to undertake experimental studies
of logical reasoning, because they were, for
such a long time, desperately trying to divorce
themselves from the influence of philosophy,
and, of course, logic is an integral discipline
of philosophy. Whatever may be the reasons
for the paucity of experimental studies of lo-
gical reasoning, it seems to be a fact that
mathematicians rather than psychologists are
concerning themselves with logic, in spite
of the stress laid on logical reasoning, espe-
cially the deductive aspects, by
Clark Hull in his establishment . . .
game or sport, 
how the game 


* This paper was presented before the psychology section of the American
Association for the Advancement of Science at its meeting in
St. Louis, Missouri, 30 December 1952.


will be played. But unlike chess, or poker, 
or basketball, the rules of logic are basic to 
science. As H. M. Johnson, himself a psy- 
chologist, has said (3, p. 74) “No artful 
manipulation of symbols according to pre- 
scribed rules can make good logic out of bad 
logic . . . . The structure of science as we 
know it is predetermined by the definitions, 
postulates, and rules of manipulation of sym- 
bols that we call modern logic. This logic 
includes the whole of the traditional or Aristo- 
telian logic, cleared of certain well known 
defects; it includes a great deal that Aristotle 
and his imitators overlooked. . . . we may be 
sure that if any procedure assumes equivoca- 
tion, affirming the consequent, denying the 
antecedent to be valid, then it does not yield 
a set of rules for ‘scientific inference.’ ” 

In view of the absence of too much experi- 
mentation on logic by psychologists, it is not 
surprising to find cropping up some rather 
far-fetched notions about the nature of logi- 
cal reasoning. In the chapter on Speech and 
Language in Stevens’ Handbook of Experi- 
mental Psychology (4, p. 806), Professor G. 
A. Miller says: “The fact is that logic is a 
formal system, just as arithmetic is a formal 
system, and to expect untrained subjects to 
think logically is much the same as to expect 
preschool children to know the multiplication
table.” 

This sentence, an argument by analogy, 
contains a number of interesting implications 
which might be subjected to analysis but we 
shall restrict ourselves to the assertion con- 
tained therein that untrained subjects can- 
not be expected to think logically. 

When we use the word “logic” we accept the 
definition given by Warren’s Dictionary of 
Psychology (10) where logic is defined as the 
“principles that enable an individual to make 
judgments or conclusions which are consistent 
with the data at hand.” 


Subjects and Procedures 


The Morgan Test of Logical Reasoning (5) 
was administered to 134 adults, all employed
by the United States Government. This test,
which was first developed in 1946 for the test- 
ing of superior adults, contains 75 true-false 
items in verbal form. The scoring formula 
is Right minus Wrong. The subjects in this 
study were allowed 30 minutes. These are a 
few sample items from the test: 


(a) All highly successful businessmen are prac- 
tical psychologists. Therefore, some practical 
psychologists are highly successful businessmen. 

(b) Most executives are college graduates. 
The majority of executives are Republicans. 
Therefore, most college graduates are Repub- 
licans. 

(c) If we rearm Germany, the French will op- 
pose us, and if we fail to maintain air bases in 
East Anglia we shall incur the resentment of the 
British. But it is essential to retain the good 
will of either France or Britain. Therefore, we 
must maintain our East Anglian air bases or else 
abandon plans for the rearmament of Germany. 

(d) No person interested in treating human 
ailments has failed to study Professor Pavlov’s 
book on the nature of the digestive juices—a 
book that won the Nobel prize. No person who 
has failed to study Professor Pavlov’s book is a 
physician. Therefore, although they may have 
other interests, it can be said that all physicians 
are interested in treating human ailments.

(e) Many women are high-strung and emo- 
tional. A high-strung and emotional tempera- 
ment is frequently a barrier to clear and logical 
reasoning. Therefore, many women are unable 
to reason logically. 

(f) You can fool some of the people all the
time. You can fool all the people some of the
time. Therefore, you cannot fool all the people
all the time.

The subjects consisted of two groups (WL and WOL) of 67 each. All
subjects in Group WL (With Logic) had had at least three semester
hours of college training in logic. No subjects in Group WOL
(Without Logic) had had any training in logic. Each person in Group
WL was paired in terms of sex, age, and college degree(s) with a
person in Group WOL.¹ In each group there were 58 males and 9
females with a mean age of 27 years and a standard deviation of 5.0.
The oldest was 42, youngest 20, median 27. There were in each group
43 with a Bachelor’s, 16 with a Master’s, and 8 with an LL.B. degree
(plus the Bachelor’s). In addition to Groups WL and WOL, there were
9 subjects with a Ph.D. degree, 7 males and 2 females. The oldest
was 57, the youngest 26, the median 33, and the mean age 36 years.
None of the Ph.D.’s had had any training in logic.

¹ The pairs were matched in terms of educational achievement as
measured by college degrees, rather than in terms of scholastic
ability. However, for 61 cases in the group with logical training
and for 65 cases in the group without logical training we have
statistics derived from the Verbal Intelligence Test (published by
Aptitude Associates, Merrifield, Va., copyright 1948). This test is
scored by summating the rights, and the maximum score is 50. The
mean score for Group WL was 38.3 (SD, 7.7); and the mean for Group
WOL was 32.1 (SD, 9.5). The Critical Ratio (D/σ_D) was 4.0. This
difference is significant at the one per cent level.


Results 


The lowest score in Group WL was —2, the 
highest 67, mean 29.1, and the standard devia- 
tion 14.0. The lowest score in Group WOL 
was — 7, the highest 48, mean 21.2, and the 
standard deviation 11.2. The means were 
significantly different beyond the 1% level. 
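
The significance statement above can be checked with a critical ratio, the statistic the authors name in their footnote on the Verbal Intelligence Test. The sketch below is a minimal illustration, assuming the usual formula for two independent groups (difference between means divided by the standard error of that difference). The paper does not print its formula, and because the logic-test groups were matched pairs the authors may have used a correlated-groups version instead. The function name `critical_ratio` is hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes independent groups; this is an illustrative choice,
    not a formula quoted from the paper.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Logic test scores: Group WL vs. Group WOL, 67 subjects each.
cr_logic = critical_ratio(29.1, 14.0, 67, 21.2, 11.2, 67)

# Verbal Intelligence Test (footnote): 61 WL cases vs. 65 WOL cases.
cr_verbal = critical_ratio(38.3, 7.7, 61, 32.1, 9.5, 65)

print(f"logic test CR  = {cr_logic:.1f}")
print(f"verbal test CR = {cr_verbal:.1f}")
```

Under these assumptions the verbal-test ratio rounds to the 4.0 given in the footnote, and the logic-test ratio exceeds 2.58, the two-tailed 1% point of the normal curve.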

By comparing the mean score for Group
WL with the mean score for Group WOL, it is
found that Group WOL did 73 per cent as
well as Group WL. Since this test is scored
Right minus Wrong, if a person is guessing
throughout, he should get a zero score, all
other things being equal. If a person does not
know how to reason logically, he would have
to guess on this test on every item, and we
would expect him to get a zero score. But
what do we find? Instead of a zero score,
these college graduates who did not have the
benefit of formal training in logic were ac-
tually able to achieve a mean score of 21.2
compared with a mean score of 29.1 for those
who had had training in logic. In other words,
they did 73% as well as the group which had
received training in logic. This is a far cry
from zero.
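
The argument in this paragraph rests on two pieces of arithmetic: the expected Right-minus-Wrong score of a pure guesser, and the ratio of the two group means. The following is a minimal sketch of both, assuming that a guesser attempts all 75 true-false items and has an even chance on each; the helper names are hypothetical.

```python
from fractions import Fraction


def right_minus_wrong(n_right, n_wrong):
    """Scoring formula stated in the paper: Right minus Wrong."""
    return n_right - n_wrong


def expected_guessing_score(n_items, p_right=Fraction(1, 2)):
    """Expected score when every item is answered by guessing.

    Assumes all items are attempted and each guess is right with
    probability p_right (1/2 for a true-false item).
    """
    expected_right = n_items * p_right
    expected_wrong = n_items - expected_right
    return right_minus_wrong(expected_right, expected_wrong)


N_ITEMS = 75
chance_score = expected_guessing_score(N_ITEMS)

mean_wl, mean_wol = 29.1, 21.2
relative_performance = 100 * mean_wol / mean_wl

print("expected score from guessing:", chance_score)
print("WOL mean as per cent of WL mean:", round(relative_performance))
```

The chance baseline is zero, and the ratio of the reported means rounds to the 73 per cent quoted in the text.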

Although Group WL obtained a higher
mean score on the test than Group WOL, it is
remarkable that of the LL.B.’s, 7 of the 8 who
had not received training in logic obtained
higher scores than their paired partners; of
the Master’s, 6 of the 16 who had not re-
ceived training in logic did better on the test
than their paired partners; of the Bachelor’s,
13 of the 43 who had not received training in
logic obtained higher scores than their op-
posite numbers in the WL Group. In other
words, 26 of the 67 subjects, i.e., 38%, in the
WOL Group did better than their paired part-
ners who had received training in logic.
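
The tally of matched pairs can be reproduced directly from the three degree groups. The sketch below also adds an exact binomial (sign-test) tail probability, which is not part of the paper: it is an illustrative extra that assumes every one of the 67 pairs produced a winner (no tied scores) and that, with no effect of training, either member of a pair is equally likely to score higher. All names are hypothetical.

```python
from math import comb

# (pairs in which the untrained WOL member scored higher, total pairs)
wol_better_by_degree = {
    "LL.B.": (7, 8),
    "Master's": (6, 16),
    "Bachelor's": (13, 43),
}

n_wol_better = sum(better for better, _ in wol_better_by_degree.values())
n_pairs = sum(total for _, total in wol_better_by_degree.values())
share_wol_better = 100 * n_wol_better / n_pairs


def binomial_lower_tail(k, n):
    """P(X <= k) for X ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# Assumption (not from the paper): no ties among the 67 pairs.
p_one_sided = binomial_lower_tail(n_wol_better, n_pairs)

print(f"{n_wol_better} of {n_pairs} pairs ({share_wol_better:.1f}%)")
print(f"one-sided sign-test probability: {p_one_sided:.3f}")
```

The tally gives 26 of 67 pairs, a little under 39 per cent, which the paper reports as 38%.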

The lowest score for the Ph.D.’s was 23, the
highest score 45, mean 32.7, and the standard
deviation 7.4. There was far less variability
in scores in the Ph.D. Group by comparison
with either Group WL or Group WOL. The
mean score for the Ph.D.’s is significantly
higher at the 1% level than the mean score
for Group WOL. The mean score for the
Ph.D.’s is also higher than the mean score
for Group WL, and the chances are 88 out
of 100 that the difference is significant.
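
The phrase "chances are 88 out of 100" can be read as the area of the normal curve below the critical ratio. The sketch below is one hedged way to recover both statements in this paragraph, assuming independent groups and a one-tailed normal-curve reading; the paper does not state how its figures were computed, and the function names are hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between means over its standard error (independent groups assumed)."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


def normal_cdf(z):
    """Area of the standard normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (mean, SD, n) for each group, as reported in the Results section.
phd = (32.7, 7.4, 9)
wl = (29.1, 14.0, 67)
wol = (21.2, 11.2, 67)

cr_phd_vs_wol = critical_ratio(*phd, *wol)
cr_phd_vs_wl = critical_ratio(*phd, *wl)
chances_in_100 = 100 * normal_cdf(cr_phd_vs_wl)

print(f"Ph.D. vs. WOL: CR = {cr_phd_vs_wol:.1f}")
print(f"Ph.D. vs. WL:  CR = {cr_phd_vs_wl:.1f}, "
      f"chances in 100 = {chances_in_100:.0f}")
```

On these assumptions the Ph.D. versus WOL ratio is well beyond 2.58, and the Ph.D. versus WL ratio corresponds to about 88 chances in 100, in line with the text.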


Conclusions 


1. In the majority of cases, college grad- 
uates who have had at least three semester 
hours of college training in logic obtain higher 
scores on a test of logical reasoning than col- 
lege graduates who have not had courses in 
logic. 

2. Professor Miller’s hypothesis that un-
trained subjects cannot be expected to think
logically is not substantiated, however, be-
cause: (a) 38% of the subjects who had had
no training in logic obtained higher scores
than their paired partners who had had col-
lege training in logic; (b) subjects without
college training in logic did 73% as well, as a
group, on the test of logical reasoning as those
who had had training in logic; and (c) college
graduates with Ph.D. degrees who had not
had college courses in logic obtained higher
scores, as a group, than college graduates with
B.A., M.A., and LL.B. degrees who had had
courses in logic.

3. Professor Miller’s hypothesis might be
restated to read, “In the majority of cases,
untrained subjects cannot be expected to
think as logically as trained subjects.”


Further Implications 


There are two problems that need to be
explored by further research: (1) Are students
with facility in clear thinking the ones who
are usually attracted to courses in logic?
(2) To what extent is proficiency in logical
reasoning the result of formal courses in logic
rather than attributable to such factors
as the subject’s native intelligence?

We have found in other studies that scores
on tests of logical reasoning always correlated
positively, often substantially, and sometimes
very highly with scores on group tests of in-
telligence such as the Henmon-Nelson, the
Thurstone ACE, the Miller Analogies, and the
Verbal Intelligence Test. Other investigators
such as Wilkins (12, p. 28), Burt (1, p. 237),
and Sells (6, p. 23) have obtained similar re-
sults. We are, therefore, inclined to suggest
that the ability to think logically is, to a cer-
tain degree, an aspect of intelligence.²

We believe that the potentiality for learn- 
ing to reason logically is dependent upon the 
native intelligence of the individual, and the 
rules by which logical reasoning is governed 
are learned in the daily experiences of life, 
sometimes in the classroom, with or without 
the benefit of instruction in formal logic. It 
would be desirable to find out what experi- 
ences and courses, other than formal courses 
in logic, increase the student’s proficiency in 
logical reasoning. It is our opinion that some 
courses, such as mathematics, even though 
not labeled as courses in logic, may have con- 
siderable “carry-over” value to logical rea- 
soning. 

Received December 31, 1952. 
References

1. Burt, C. L. Mental and scholastic tests. London: P. S. King &
Son Co., 1921.

2. Eidens, H. Experimentelle untersuchungen über den denkverlauf
bei unmittelbaren folgerungen. Arch. f. d. ges. Psychol., 1929,
71, 1-66.

3. Johnson, H. M. If-then relations in paralogics. Psychol. Rev.,
1944, 51, 69-75.

4. Miller, G. A. Speech and language. Chapter 21, pp. 789-810, in
Stevens, S. S., Handbook of experimental psychology. New York:
John Wiley & Sons, 1951.

5. Morgan, W. J. and Morgan, A. B. The Morgan Test of Logical
Reasoning. To be published about June 1954 by Aptitude Associates,
Merrifield, Virginia.

6. Sells, S. B. The atmosphere effect. Arch. of Psychol., 1936,
200, 1-72.

7. Störring, G. Experimentelle untersuchungen über einfache
schlussprozesse. Arch. f. d. ges. Psychol., 1908, 11, 1-127.

8. Störring, G. Psychologie der disjunktiven und hypothetischen
urteile und schlusse. Arch. f. d. ges. Psychol., 1925, 54, 23-84.

9. Störring, G. Psychologie der zweiten und dritten schlussfigur
und allgemeine gesetzmässigkeiten der schlussprozessen. Arch. f. d.
ges. Psychol., 1926, 55, 47-110.

10. Warren, H. C. Dictionary of psychology. Boston: Houghton
Mifflin Co., 1934.

11. White, E. E. A study of the possibility of improving habits in
thought in school children by a training in logic. Brit. J. educ.
Psychol., 1936, 6, 267-273.

12. Wilkins, M. C. The effect of changed material on ability to do
formal syllogistic reasoning. Arch. of Psychol., 1928, 102, 1-83.

² It would be desirable to cross-validate this study,
with special care devoted to matching WL and WOL
Groups of college graduates on the basis of measured
intelligence.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953


The Effect on Recall of Changing the Position of a Radio 
Advertisement * 


William A. Belson 
Birkenbeck College, London, England 


This inquiry was aimed at establishing the 
relative rates of recall, under normal listening 
conditions, of a given advertisement placed 
(a) at the beginning of a program (beginning 
advertisement) and (b) in the middle of a 
program (interruption advertisement). It 
was, in effect, an inquiry into the relative rates 
of recall of a beginning advertisement (B) 
and an interruption advertisement (I). This 
comparison was made in respect of normal 
(N) or at-home listening conditions and not 
in respect of the situation in which people 
listen with the intention of learning. What-
ever the relative merits of the two adver-
tisements in the latter situation, there are
plausible grounds for theorizing that under N
listening conditions various subjective evoca-
tions such as hostility, inattention and certain
defense mechanisms tend to enter effectively
into the perception processes to the special
detriment of I.

The grounds for this theory are two-fold.
In the first place, the interruption type of ad-
vertisement emerged, in a preceding survey in
Sydney, as the most disliked form of radio
advertising and, indeed, as a source of no
little hostility (2). Second, the work of Bart-
lett, Levine and Murphy, Rapaport and others
(1, 7, 9, 10) has indicated the importance to
perception of affective tendencies, partisan-
ship and various personal factors.

While, however, a superiority of B over I would conform to the
theory, such a comparison would not provide a crucial test.
Differences in recall between B and I could, in fact, arise out of
conditions other than differential subjective evocations. To
provide a crucial test, it was necessary to examine differences in
recall of B and I first where those subjective evocations peculiar
to the N situation were operative and secondly where they were
eliminated. The latter situation required, in fact, the development
of a learning set (L) in relation to B and I.

* While the inquiry is presented here in comparative isolation, it
was in fact conducted as part of a wider investigation into the
relation of attitude to recall in radio advertising. Findings from
this wider study will be introduced only where they contribute to
the interpretation of the present results. The investigation was
carried out in Sydney, Australia, where commercial broadcasting
predominates (2). The author is now studying for the Ph.D. at
Birkenbeck College.


Method 


Design. Four groups G1, G2, G3 and G4 were matched according to
intelligence, age, sex, occupation and general background. Two of
them, G1 and G2, heard B and I (respectively) under N conditions.
This meant that there was an attempt to evoke in them N reactions
(i.e., NB and NI). The other two matched groups, G3 and G4, heard B
and I (respectively) after the development in them of L. Each of
the four groups was subsequently tested for recall of the
advertisement to which it had been exposed. Difference in recall
(R) between G1 and G2 (i.e., $R_{NB} - R_{NI}$) represents the
advantage in recall of one placement over the other. The full
extent of the difference in recall which may be attributed to N
reactions is equal to $(R_{NB} - R_{NI}) - (R_{LB} - R_{LI})$.
It is conceivable, however, that such differences in recall as
might occur could arise out of unplanned group differences in
respect of personal characteristics and testing conditions. To
provide a check on this possibility a control device was
incorporated into the design. Advertisements B and I were carried
by identical programs, though of course they were recorded on
different wires. Sections of additional advertisement were
introduced in equivalent positions on each wire. These two
additional sections constituted the control material on each wire
and each group was also tested for recall of this control material.
If unplanned differences had not occurred, then differences in
recall of control material in either the N or the L situations
should not be significant. Details of this control device are
presented in Figure 1.


Material. Material used included two wire recordings of commercial
programs, playback equipment, program opinion sheets and question
booklets.

Two Wire Recordings of Commercial Programs. The advertisements were
carried by the program “Sherry and Son.” As shown in Figure 1, the
advertisements fell into several parts. On each wire there was an
opening advertisement and [illegible]. On one wire, however,
[illegible] had been [illegible] “Sherry and Son.” This was the I
[illegible]. On the other wire the [illegible] the advertisement
was not [illegible] called B. Hence it will be [illegible]
placement was concerned, [illegible] a preceding and [illegible]
common, [illegible] this additional material which constituted the
control. [illegible] the experimental material [illegible]

Program Opinion Sheet. This was a sheet which asked for written
opinions of each of three programs. It was part of the technique
used in deceiving subjects into reacting normally to the
advertisement. Details follow.

Beginning Placement of the Advertisement

Part Control Material: We have pleasure in presenting to you the
story of “Sherry and Son,” brought to you by Raymonds, the makers
of distinctive sweets. Have you tasted Crunch Block, the sweet with
the special flavour? It’s made by Raymonds, the sweet makers of
distinction. Only the best ingredients go into it—honey, glucose
and nuts—all of them ideal for sweets. Crunch Block is really worth
tasting and is obtainable at the manufacturer's own store in the
Royal Arcade and at confectioners, grocers and milk-bars.

B: Raymonds, the makers of Crunch Block, have developed a new
process called Bubbling which makes the texture of the sweet fine
and smooth. This product is vitamin packed and has special
nutritive value and is manufactured under strictly hygienic
conditions. Crunch Block costs threepence and there is no shortage
of supply.

“SHERRY AND SON” (ALL)

Part Control Material: This program is brought to you by Raymonds,
the makers of Crunch Block, the sweet of distinction. Don’t forget
to ask for it; you'll enjoy its special flavour.

Interruption Placement of the Advertisement

Part Control Material: We have pleasure in presenting to you the
story of “Sherry and Son,” brought to you by Raymonds, the makers
of distinctive sweets. Have you tasted Crunch Block, the sweet with
the special flavour? It’s made by Raymonds, the sweet makers of
distinction. Only the best ingredients go into it—honey, glucose
and nuts—all of them ideal for sweets. Crunch Block is really worth
tasting and is obtainable at the manufacturer's own store in the
Royal Arcade and at confectioners, grocers and milk-bars.

FIRST HALF OF “SHERRY AND SON”

I: And now we briefly interrupt our story. Raymonds, the makers of
Crunch Block, have developed a new process called Bubbling which
makes the texture of the sweet fine and smooth. This product is
vitamin packed and has special nutritive value and is manufactured
under strictly hygienic conditions. Crunch Block costs threepence
and there is no shortage of supply.

SECOND HALF OF “SHERRY AND SON”

Part Control Material: This program is brought to you by Raymonds,
the makers of Crunch Block, the sweet of distinction. Don’t forget
to ask for it; you'll enjoy its special flavour.

Fig. 1. Text of the two advertisements and their positions relative
to program and control material. Items on which recall was tested
are italicized.

Instructions. N. Situation. The wire carry- 
ing B was played to G1 and that carrying I to 
G2. Subjects were told that the purpose of the 
session was to get their opinions of these pro- 
grams and were provided with opinion sheets for 
this purpose. These programs were said to be 
taken direct from the library of one of the com- 
mercial radio stations and to be just as they 
would be if they were going on the air. The 
first of the programs, a collection of three vocal 
items called “Just for You,” was played with its 
advertisement; the playback machine was stopped 
and subjects were asked to write their candid 
opinions of the program. The purpose of this was to facilitate the
deception. When subjects had finished this, they heard “Sherry and
Son” with its advertisement and subsequently wrote opinions of that
program too. After this the question booklets were distributed:
there had been no warning at all of this step. Subjects were asked
to write down their feelings (like/dislike) about the advertisement
in “Sherry and Son” and about advertising in general. They were
then given recall tests by specific question¹ and multiple choice
methods.

L Situation. The wire carrying B was played to G3 and that carrying
I to G4. Subjects were told that while they would be asked for
opinions of the programs, their main job was to listen to the
advertisement in “Sherry and Son” and that they would be required
to recall it at the end of the program. They were asked to
concentrate on remembering the advertisement. Opinions of the
programs were asked for and recall tests were made as with G1 and
G2.

Scoring. Items on which scores were based are those italicized in
Figure 1. Marks were given for each correct reproduction, one on
the specific question system and one on the multiple choice system,
making a total of 16 marks on the 8 items included in the B/I
material and a possible of 28 marks on the 14 items included in the
control material. Marks for the recall of the names of producer and
product were excluded from totals because these items were common
to the B/I and the control material.

¹ What was the name of the product being advertised? What were the
contents—the ingredients—of the product? I mean what did it have in
it?

Table 1

Recall Scores* on Beginning and Interruption Advertisements

                     Beginning       Interruption    Significance of
                     Advertisement   Advertisement   Difference†
Conditions of
Exposure             Mean    SD      Mean    SD      CR           P
Normal Reaction      3.42    2.86    1.88    2.17    [illegible]  [illegible]
Learning Set         4.91    2.38    6.67    2.87    2.2          0.032

* Score out of 16 marks.
† Two tails of the distribution.


Results 


From Tables 1, 2, and 3 it will be seen that
in the N situation recall of I is very sig-
nificantly less than recall of B (P = [illegible]),
whereas in the L situation, recall of I is very
significantly greater than recall of B (P
= .032). This represents a large and sig-
nificant reversal of the advantage of I in
going from the L to the N situation (P
= .002). Expressed in terms of percentages
of recall, B was recalled in the N situation
about twice as well as I (21% vs. 12%),
whereas in the L situation B was recalled only
three quarters as well as I (31% vs. 42%).
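
The percentages in this paragraph follow from the Table 1 means taken as fractions of the 16 possible marks, and the same means can be put through the double difference given in the Design section, $(R_{NB} - R_{NI}) - (R_{LB} - R_{LI})$. The sketch below is only an arithmetic check on the published means; the variable names are hypothetical, and it does not attempt to reproduce the significance tests.

```python
MAX_MARKS = 16  # each recall score in Table 1 is out of 16 marks

# Mean recall scores from Table 1, keyed by (situation, placement).
mean_recall = {
    ("N", "B"): 3.42,
    ("N", "I"): 1.88,
    ("L", "B"): 4.91,
    ("L", "I"): 6.67,
}

# Recall expressed as a percentage of the possible marks.
percent_recall = {
    key: round(100 * mean / MAX_MARKS) for key, mean in mean_recall.items()
}

# Advantage of B over I in each situation, then the double difference
# that the Design section attributes to normal (N) reactions.
advantage_b_normal = mean_recall[("N", "B")] - mean_recall[("N", "I")]
advantage_b_learning = mean_recall[("L", "B")] - mean_recall[("L", "I")]
attributable_to_n = advantage_b_normal - advantage_b_learning

print(percent_recall)
print(f"B - I under N: {advantage_b_normal:+.2f} marks")
print(f"B - I under L: {advantage_b_learning:+.2f} marks")
print(f"double difference: {attributable_to_n:+.2f} marks")
```

The sign of the B minus I difference changes between the two situations, which is the reversal the text describes.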
This phenomenon does not, for the following reasons, appear to be
an artifact arising out of unplanned differences between groups in
respect of personal characteristics or testing conditions. First,
recall of control material occurring with B and I, respectively,
does not differ significantly in either the N (P = .64) or the L
(P = .57) situations, while the slight advantage in the L situation
of control material occurring with B is maintained in the N
situation (P = .79). Secondly, the four groups appeared to be well
matched and test conditions did not noticeably deviate from plan.

Moreover, this reversal phenomenon was repeated with each of the 8
items in B/I on which recall tests were made.

Table 2

Recall Scores* on Control Material Occurring with the Beginning and
Interruption Advertisements, Respectively

                     Beginning        Interruption         Significance of
                     Advertisement†   Advertisement†       Difference
Conditions of
Exposure             Mean    SD       Mean    SD           CR           P
Normal Reaction      7.92    2.90     [illegible]          [illegible]  0.64
Learning Set         10.75   4.76     11.42   [illegible]  [illegible]  0.57

* Score out of 28 marks.
† Control material occurring with the experimental placement.

Table 3

Characteristics of Matched Groups*

          Intelligence†    Age                     Size of
Group     Mean    SD       Mean     SD     Sex     Group
G1        5.04    0.81     26.40    2.26   male    31
G2        5.30    0.71     25.10    2.00   male    [illegible]
G3        5.09    0.87     25.17    2.44   male    24
G4        5.30    0.82     25.00    2.60   male    24

* The occupation and background of the four groups were the same:
trainee carpenters under the Commonwealth Reconstruction Training
Scheme; ex-servicemen (non-commissioned) recruited from semi- or
unskilled occupations.
† In standard scores.


Discussion 


It is not difficult to theorize about the causes of the relative
disadvantage of the I placement. A theory of inhibition through
hostility would not only have a certain plausibility, but would
also be supported by the fact, already reported, that in an
accompanying survey in Sydney (2), the interruption placement
emerged as the most disliked form of radio advertising. Some
caution is needed, however, for it was also found (2) that
verbalized attitude (in terms of like/dislike), whatever its
concomitant organic [illegible] might be, was not correlated with
recall (r = +.01 ± .11 with intelligence partialled out). Under
the circumstances, there is a case for postulating the existence of
a generalized defense mechanism of an [illegible] type, a theory
which gains additional support from a further finding (2) of very
little or no correlation between alleged [illegible] attention to
the advertisement and recall (r = [illegible].14 ± .11 with
intelligence partialled out). Quite clearly, however, further
theorizing and research are required at this point.

A second interpretative point must be made. 
The present use of “captive audiences” leaves 
out of account certain aspects of the real home 
listening situation: people listening at home 
are usually free, during the broadcast of an 
advertisement, to walk about, talk or turn 
the set off. In fact, one of the claimed ad- 
vantages of the interruption placement of an 
advertisement is that people are less likely to 
walk about, tune out, etc., during the middle 
of a program than at the beginning. The 
present study was not, however, directly con- 
cerned with this issue, although the issue is 
one on which research might well be con- 
ducted. 


Summary and Conclusions 


The prime purpose of this inquiry was to 
compare, under normal listening conditions, 
rates of recall of an advertisement placed at 
the beginning and in the middle of a program. 
There was, in fact, a good case for theorizing 
that reactions such as hostility, inattention 
and defense mechanisms would normally enter 
the perception processes to the special detri- 
ment of the interruption type of advertise- 
ment. 

While a superiority in recall of the begin- 
ning advertisement would concur with this 
theory of differential reaction, a crucial test 
would also require a comparison of recall rates 
when such reactions were eliminated. Hence 
the full investigation involved a comparison 
of recall of the two advertisements after (a) 
normal reaction and (b) the establishment of 
a learning set. 




Four matched groups were exposed in pairs 
under conditions (a) and (b), respectively, 
to a specially designed advertisement, one of 
each pair hearing the beginning placement and 
the others the interruption placement. A 
fairly elaborate administrative procedure was 
used to evoke “normal” reactions to the ad- 
vertisements. A control device was designed 
to detect differences in recall arising out of 
unplanned group differences in respect of per- 
sonal characteristics and testing conditions. 

Results showed that normal reaction to the 
advertisement in the interruption placement 
interfered with perception much more than 
did normal reaction to it in the beginning 
placement. The difference was, in fact, such 
that the very significant advantage of the in- 
terruption advertisement under learning set 
conditions of exposure was reversed, and very 
significantly so, when normal reactions took 
place. 


Received November 24, 1952.

References

1. Bartlett, F. C. Remembering: a study in experimental and social
psychology. Cambridge, England: Cambridge University Press, 1932.

2. Belson, W. A. The relation of attitude to perception and recall
in radio advertising. Unpublished Bachelor’s thesis, Univ. Sydney,
1949.

3. Chein, I. Behavior theory and behavior of attitudes: some
critical comments. Psychol. Rev., 1948, 55, 175-188.

4. Doob, L. W. The behavior of attitudes. Psychol. Rev., 1947, 54,
135-156.

5. Droba, D. D. The nature of attitude. J. soc. Psychol., 1933, 4,
444-463.

6. Garrett, H. E. Statistics in psychology and education. (3rd Ed.)
New York: Longmans, Green and Co., 1947.

7. Levine, J. M. and Murphy, G. The learning and forgetting of
controversial material. J. abnorm. soc. Psychol., 1943, 38,
507-517.

8. McNemar, Q. Opinion-attitude methodology. Psychol. Bull., 1946,
43, 289-374.

9. Rapaport, D. Emotions and memory. Psychol. Rev., 1943, 50,
234-243.

10. Selleman, V. The influence of attitude upon the remembering of
pictorial material. Arch. Psychol., N. Y., 1940, 258, 1-63.


The Journal of Applied Psychology
Vol. 37, No. 5, 1953 


Check-Reading as a Function of Pointer Symmetry and Uniform 
Alignment * 


Keith W. Johnsgard 
The State College of Washington 


During the past several years an extensive 
program of psychological research has been 
carried on dealing with human reactions to 
aircraft instrument panels. This program, 
undertaken primarily by the United States Air 
Force, is an attempt to simplify the increasingly
complex task of reading aircraft instruments
under flight conditions.

Aircraft instruments serve for three basic
types of reading: (1) check-reading for assurance
of a normal indication; (2) qualitative
reading for the meaning of a deviation;
and (3) quantitative reading for the actual
numerical value of an indication (5). This
paper is concerned with the first type and employs
the rotating pointer type indicator which
has been shown to facilitate short latency responses
with a minimum of errors (3).

A recent study has indicated that rectangular
arrangement of small engine instruments
on multi-engine aircraft and the use of rotatable
dials, making possible uniform pointer
alignment under flight conditions, will provide
a significant advantage in speed and accuracy
of check-reading (6). However, another project
concerning recognition span made it apparent
that check-reading is facilitated by
pointer symmetry even when the pointers are
not all in the same standard position such as
9 o’clock (8). Both uniform pointer alignment
and pointer symmetry are superior to
mixed alignment for check-reading. The
development of rotatable dials makes these
principles applicable to engine instrument
panels. With this arrangement dials could be
fixed in such a manner that the pointers might
form any pattern that would facilitate rapid
and accurate check-reading wherein any deviation
could be quickly identified in terms
of direction, engine, and function.

* Submitted in partial fulfillment of the requirements
for the M.S. degree in the Department of Psychology
in the Graduate School of the University of
North Dakota. The author wishes to thank Dr.
Hermann F. Buegel, under whose direction the research
was conducted.


A basic problem then is a consideration of 
which type of pointer-pattern would facilitate 
the most efficient check-reading. It is this 
problem with which this paper is concerned. 


Method 


Stimulus Preparation. Four sixteen-dial panels 
containing different pointer patterns were used 
and will be referred to as configurations through- 
out the remainder of this report. A null hy- 
pothesis was stated that the four configurations 
would equally facilitate check-reading. The con- 
figurations are shown with pointers in a null po- 
sition in Figure 1. 

Nineteen stimulus panels were prepared for 
each configuration. One panel showed the point- 
ers in a null position with the remaining eighteen 
panels for a particular configuration containing 
dials in which pointers were deviating from null 
position. These eighteen panels were split into 
six blocks of three panels each. Panels of the 
first block each contained one deviating pointer, 
panels of the second containing two, and so on 
with each panel of the sixth block containing six 
deviating pointers. A complete set of eighteen 
test panels for any one configuration contained a 
total of 63 deviating pointers. The dial or dials
within a particular panel that were to contain
deviating pointers were chosen in a random manner
(4). The position of the deviating pointer
within an error dial was chosen in the same way,
requiring a possible discrimination ranging from
a maximum of 180 degrees to a minimum of 15
degrees from the null pointer position.

Fig. 1. The four configurations with pointers in a
normal or correct position.
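The panel counts given under Stimulus Preparation can be checked with a few lines of arithmetic. The sketch below simply restates the design (six blocks of three panels, with block b holding b deviating pointers per panel); the constant and function names are illustrative and do not come from the original report.

```python
# Sketch of the stimulus-set arithmetic described under Stimulus Preparation.
# All names here are illustrative assumptions, not the author's notation.
PANELS_PER_BLOCK = 3   # each block holds three test panels
BLOCKS = 6             # block b contains b deviating pointers per panel


def deviating_pointers_per_block():
    """Return {block number: deviating pointers summed over that block's panels}."""
    return {b: b * PANELS_PER_BLOCK for b in range(1, BLOCKS + 1)}


per_block = deviating_pointers_per_block()
total_test_panels = PANELS_PER_BLOCK * BLOCKS
total_deviating = sum(per_block.values())

print("test panels per configuration:", total_test_panels)
print("deviating pointers per configuration:", total_deviating)
```

The total is the maximum possible score on one configuration, which is useful to keep in mind when reading the mean scores in Table 1.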

Stimulus material was prepared from 35 mm. 
negative film with projected dial borders and 
pointers appearing white on a dark background 
when flashed on a screen (2, 11). The diameter 
of a projected dial was three inches (7, 9, 10, 
13). Pointer width was approximately 3/32 inch 
extending from the center of the dial to the 
border (1). 

Physical Conditions. A modification of the 
Whipple pendulum type tachistoscope was used. 
The noiseless mechanism is described and pic- 
tured in a recent publication (11). Stimulus ma- 
terials were contained in a slide projector behind 
the tachistoscope. The projector was 34 inches 
from the floor with the light beam projected 
horizontally to a screen seven feet away. The
brightness of the illuminated screen when using
transparent slides was approximately eight
foot-lamberts (2).

Two S’s were tested simultaneously, and were 
seated on each side of the projected beam in two 
single-arm writing desks pointed directly at the 
center of the projected panel. The distance from 
the projection area to the S’s eyes was approxi- 
mately 50 inches (1). 

The testing room was darkened to maintain ef- 
fective contrast (14). Directly over the subjects 
a soft beam of light was directed downward to 
facilitate writing responses on score sheets. The 
beam did not affect screen brightness. 

An exposure time of a half second was allowed. 
This is the average fixation time for pilots engaged
in instrument flying (8).

The Sample. The sample population consisted
of 48 male students enrolled at the University of
North Dakota. The ages ranged from 18 through
37, with all but 7 of the S’s between the ages of
18 through 26. Visual acuity was checked before
experimentation with a Snellen Eye Chart. A
criterion of 20/25 was set as a minimum of visual
acuity. Subjects with sight corrected to this
criterion by glasses were considered satisfactory
for experimental purposes. None of the S’s had
received any tachistoscopic training of any kind.

Test Procedure. After being checked for
sight, the two S’s were seated, acquainted with
[...] position. S’s were instructed to check the
appropriate dial or dials on the response sheets that
corresponded to those on the test panel containing
deviating pointers. Presentation of the
test panels for the appropriate configuration
followed. Before an individual panel presentation
the S was informed of the panel number and was
given a ready signal. Approximately 1 second
later the exposure occurred. After each exposure
as much time as was needed was allowed for
responding.

Following observation of two sets of test panels
a short rest was allowed. The entire test period
varied from 35 to 45 minutes depending on speed
of response.

In an attempt to eliminate practice effect from
the total group results the order in which the
four tests were administered was varied. One
group of 12 S’s observed the configurations in
numerical order. A second group began with
configuration 2, a third group with configuration
3, and a fourth with configuration 4. S’s for each
group were selected at random.

Method of Scoring. One point was allowed for
each dial correctly identified as containing a
deviating pointer. Four total configuration scores
were computed for each paper. Each of these
total scores was the sum of six sub-scores. The
sub-scores were the correct responses made on
each of the six sets of three panels that contained
from one to six deviating pointers. It was not
considered necessary to penalize incorrect
responses. The method was arbitrary.
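The scoring rule just described can be written out as a short sketch. It assumes each response sheet is recorded as the set of dial positions the subject checked; that data layout and the function names are hypothetical. Only the rule itself (one point per correctly identified dial, no penalty for incorrect checks, six sub-scores summed to a total) is taken from the text.

```python
def score_panel(true_deviating, checked):
    """One point per dial correctly identified as deviating.

    Dials checked in error earn nothing and cost nothing, as in the text.
    """
    return len(set(true_deviating) & set(checked))


def configuration_score(panels):
    """Score one configuration.

    panels: iterable of (block, true_deviating, checked) tuples, where block
    runs from 1 to 6. Returns (sub_scores_by_block, total_score).
    """
    sub_scores = {block: 0 for block in range(1, 7)}
    for block, true_deviating, checked in panels:
        sub_scores[block] += score_panel(true_deviating, checked)
    return sub_scores, sum(sub_scores.values())


# Hypothetical responses for two panels (dial positions numbered 1-16).
example = [
    (1, {3}, {3}),            # the one deviating dial was found
    (2, {1, 5}, {5, 8}),      # one hit, one miss, one uncounted false check
]
sub_scores, total = configuration_score(example)
print(sub_scores, total)
```

Because false checks are ignored, a subject who checked every dial would obtain a perfect score under this rule, which is presumably part of what the author means in calling the method arbitrary.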


Results 


Means of total correct responses in locating
error dials in the entire set of 18 slides for
each configuration are listed in Table 1. To
test the null hypothesis that the configurations
were of equal difficulty, a small sample
t test for correlated means was computed.
The results of this test are indicated in Table
2. The test showed configuration 3 to be
significantly superior to the others tested for
check-reading. The stated null hypothesis
might safely be rejected. Differences between
the standard deviations proved to be insignificant.


Table 1

Means, Standard Deviations, and Standard Errors
for Total Correct Responses

Configuration     Mean      s       SE
     1            30.19    6.80     ...
     2            31.46    5.72     ...
     3            34.48    4.85     ...
     4            17.54    5.94     ...


Table 2

Confidence Levels and t-Scores Between Total
Correct Response Means

Configuration                    Confidence
Means Compared        t          Level
   C1-C2             1.61        .120
   C1-C3             4.94        Beyond .001
   C1-C4            12.82        Beyond .001
   C2-C3             3.92        Beyond .001
   C2-C4            13.5         Beyond .001
   C3-C4            18.35        Beyond .001
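The text names a small sample t test for correlated means, since the same 48 S’s were scored on every configuration. The report does not show the computation, so the following is only a sketch of the standard paired-difference form of that test. The function name is illustrative, and the scores in the usage lines are invented, not data from the study.

```python
import math


def paired_t(x, y):
    """t statistic for correlated (paired) means, with its degrees of freedom.

    x and y are scores from the same subjects under two conditions.
    Uses t = mean(d) / (s_d / sqrt(n)), with s_d computed on n - 1.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of at least 2 scores")
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    se_d = math.sqrt(var_d / n)
    return mean_d / se_d, n - 1


# Hypothetical scores for four subjects on two configurations.
t_value, df = paired_t([1, 2, 3, 4], [0, 1, 1, 2])
print(round(t_value, 3), df)
```

With 48 subjects the test would have 47 degrees of freedom. Pairing matters here because the same subjects contribute to both means in each comparison.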

The performance curves for the 48 S’s on
each configuration are shown in Figure 2.
The curves are best fit by a method of least
squares and in all cases the fits were very reasonable.
It should be stated that a definite
restriction exists for the interpretation of these
data. The data are plotted as mean correct
responses as a function of blocks of three
trials. However, an experimental maximum is
imposed by design, since only one error dial
is contained in each panel of the first block,
two in the second block, and so on with six in
the last block. This factor could have influenced
the first two blocks of trials but perhaps
affected no further blocks. With this
restriction in mind, the data were plotted,
as curve shape was considered to be important
with regard to further practice. Examination
of the formula for configuration 2 is significant
in that the suggested asymptote is
3.14, while those for configurations 1 and 3
are 2.26 and 2.64, respectively. There exists
the possibility that with further practice, configuration
2 might prove to be more useful
than any of those tested.

Fig. 2. Mean error dials located in each of the four
configurations, with each abscissa point representing
the mean values of blocks of three trials. (Ordinate:
mean correct responses; abscissa: trials.)

It will be recalled that the order of con- 
figuration presentation was varied in order to 
compensate for possible over-all practice effect. 
With regard to this an analysis of total con- 
figuration scores was made. The mean score 
for the 48 Ss on the first configuration pre- 
sented was 27.42, with the mean score for the 
last configuration presented being 29.56. It 
is evident that some transfer exists between 
configurations. 


Fitted curves plotted in Fig. 2:

C1: R = 2.26 - 2.896e^{-.2561T}
C2: R = 3.14 - 2.67e^{-.0885T}
C3: R = 2.64 - 2.68e^{-.2058T}
C4: R = .19T + .31
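The asymptotes quoted in the text for configurations 1, 2, and 3 follow from the form of the fitted curves. As a sketch, the three exponential fits can be written in the generic form below. The symbols a, b, and k are labels introduced here for illustration and are not the author's notation, and the base of the exponential is taken as printed.

```latex
R(T) = a - b\,e^{-kT}, \qquad b > 0,\ k > 0

\lim_{T \to \infty} R(T) = a

a - R(T) = b\,e^{-kT} \quad \text{(the remaining gap shrinks by a constant factor per block)}
```

On this reading a is the level the fitted curve approaches with continued practice (3.14 for configuration 2, against 2.26 and 2.64 for configurations 1 and 3), and k sets how quickly that level is approached. The linear fit for configuration 4 has no asymptote of this kind.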


Grether and Warrick (6) report that about 
twice as much time was required to check- 
read a sixteen-dial panel employing four sub- 
groups than to check-read a panel of equal 
dial number with pointers in uniform align- 
ment at the nine o’clock position. This study 
tends to reinforce that finding in that about 
twice as many error dials were found in con- 
figuration 1 with uniform alignment at nine 
o’clock as in configuration 4 which employed 
the four sub-groups. These same investiga- 
tors have shown that the nine o’clock po- 
sition is the most favorable pointer position 
of uniform alignment. This experiment has 
indicated that panels employing pointer symmetry
are equally as good as the most favorable
position of uniform alignment for check-reading.
The results further suggest, with
reasonable assurance, that one of the configurations
(C3) with pointer symmetry is superior
to uniform alignment (C1) and that
another (C2) might prove superior with practice.

The findings of this experiment would in- 
dicate that rotatable dials and a sixteen-dial 
rectangular panel would facilitate check-read- 
ing of aircraft engine instruments. It is likely 
that these principles can be applied in most 
situations where a multi-engine arrangement
exists, such as in industry, where rapid, accurate
check-reading is a necessity.


Summary 


A tachistoscopic study in which simulated 
instrument dials were observed at short ex- 
posure was performed to determine efficiency 
in locating deviating dial pointers in four in- 
strument panels employing the principles of 
uniform alignment, pointer symmetry, and 
sub-grouping of pointer pattern. A null hypothesis
was stated that the four patterns
would equally well facilitate check-reading.
S’s totalled 48 naive male students.

The results of the experimentation allow a
statement of the following tentative conclusions:

1. Configurations in this experiment employing
pointer symmetry facilitate check-reading
equally as well as do panels with uniform
alignment. There is reasonable evidence
that one of the former type is superior to the
latter in terms of number of correct responses,
and that another might prove superior with
practice. The null hypothesis was rejected.

2. Panels employing pointer symmetry and
uniform alignment are superior to sub-groups
for check-reading.

3. Check-reading improves with a relatively
short amount of practice and some transfer
exists between panels with differing pointer
positions.

4. It was suggested that the use of a rectangular
sixteen-dial panel of aircraft engine
instruments with rotatable dials would facilitate
rapid check-reading, and that these principles
might profitably be applied in other
situations where multi-engine panels are used.


Received January 12, 1953. 


References 


1. Armed Forces-NRC Vision Committee. Standards to be employed in
   research on visual displays. Ann Arbor: University of Michigan, 1947.

2. Chalmers, E. L., Goldstein, M., and Kappauf, W. E. The effect of
   illumination on dial reading. U. S. Air Force Air Materiel Command,
   A. F. Tech. Report No. 6021, 1950.

3. Connell, S. C. Some variables affecting instrument check reading.
   U. S. Air Force Air Materiel Command, A. F. Tech. Report No. 6024,
   August 1950.

4. Edwards, A. L. Experimental design in psychological research.
   Rinehart and Company, Inc., New York, 1950, pp. 23-24.

5. Grether, W. F. Discussion of pictorial versus symbolic aircraft
   instrument displays. AAF AMC, Engng. Div., Aero Med. Lab.,
   TSEAA-694-8B, August 4, 1947. (Unclassified, English)

6. Grether, W. F. and Warrick, M. J. The effect of pointer alignment on
   check-reading of engine instrument panels. U. S. Army Air Force
   Headquarters, Air Materiel Command, Engineering Division, Memo
   Report No. MCREXD-694-17, 1948.

7. Grether, W. F. and Williams, Jr., A. C. Speed and accuracy of dial
   reading as a function of dial diameter and angular spacing of scale
   divisions. In P. M. Fitts (Ed.), Psychological research on
   equipment design. Washington, D. C.: U. S. Government Printing
   Office, 1947.

8. Johnson, T. G. Recognition span and reading patterns in simulated
   instrument combinations. Unpublished Master’s Thesis, University of
   North Dakota, Grand Forks, N. Dak., 1950.

9. Kappauf, W. E. and Smith, W. M. Design of instrument dials for
   maximum legibility: II. A preliminary experiment on dial size and
   graduation. USAF Air Materiel Command Memorandum Report
   MCREXD-694-1N, 1948.

10. Kappauf, W. E., Smith, W. M., and Bray, C. W. Design of instrument
    dials for maximum legibility: I. Development of methodology and
    some preliminary results. USAF Air Materiel Command Memorandum
    Report MCREXD-694-11, 1947.

11. Stevens, S. S. Handbook of experimental psychology. John Wiley and
    Sons, Inc., New York, 1951, p. 1301.

12. Thurstone, L. L. A factorial study of perception. Chicago: U. of
    Chicago Press, 1944, pp. 36-37.

13. White, W. J. The effect of dial diameter on ocular movements and
    speed and accuracy of check-reading groups of simulated engine
    instruments. USAF Air Materiel Command Technical Report 5826, 1949.

14. Woodworth, R. S. Experimental psychology. New York: Henry Holt and
    Co., 1938, p. 688.

The Journal of Applied Psychology
Vol. 37, No. 5, 1953 


Visual Performance as a Function of Low Photopic Brightness 
Levels * 


Milton L. Rock 
E. N. Hay & Associates, Inc., Philadelphia 


A systematic investigation of performance 
in visual tasks as a function of low photopic 
brightness levels is essential to expand our 
knowledge of adequate visual performance 
levels. Although there have been many stud- 
ies under higher brightness levels (above 1 
foot-lambert) the range between cone thresh- 
old and 1 foot-lambert has been relatively 
neglected. Stimulated by the needs of the 
last world war, interest in this region has been 
increasing and the time seems now appropriate 
for a systematic summary of information in 
this area. When new experiments are added 
to fill the gaps, this should give us a more 
nearly complete theoretical and practical 
knowledge of the problem. Senders’ (27) 
summary and Rock’s (25) annotated bibli- 
ography (sections B and C) concerning stud- 
ies of visual acuity are comprehensive and 
should be consulted for a background in this 
general area. In this discussion visual acuity 
per se will not be considered a measure of 
performance and in most studies reported was 
a controlled variable. 

A number of variables have been shown to
be of importance in the investigation of visual
performance. For example, defects of both
the visual mechanism and the stimulus objects
must be controlled. Ferree and Rand (11,
12) and Sheard (28) have indicated that
ocular defects, presbyopia in their cases, require
increased illumination for adequate performance.
Tinker’s (37) study on illegible
print also indicates a need for higher brightness
levels with this defective stimulus
object. Tinker (36, 38) has shown that
adaptation level of the eye has an enormous
effect on performance and also on subjects’
brightness level preferences. Investigations
of the effect on visual performance of the
quality of light have given conflicting results.
Ferguson and McKellar (9), investigating
binocular visual observations of a Landolt
ring at brightness levels below 1 foot-lambert,
found that at 0.5 foot-lamberts the best performance
was with red light, then amber,
white, blue and the poorest, green. Many
other studies such as that of Craik (7), who
investigated legibility of different colored instrument
markings at low illumination, also
found green and blue to be inferior. Hartline
(15) investigated the relative merits of light
of different wave-lengths in the airplane cockpit
situation and found the measure of individual
thresholds to be a good indicator of
visual function at low intensity levels. McFarland
(22) recommends that for aircraft
use at night, no wave-length below 620 mμ
(red-orange) be used. Some studies have reported
on the effectiveness of performance
under various wave-lengths of light. Spragg
and Rock (30) investigating performance in
reading airplane dials under four different
wave-lengths and at two illumination levels
(.01 and 0.1 foot-lamberts) found that within
the range of colors and brightness studied dial
reading performance showed no consistent relationship
to the wave-length composition of the
illuminant. These results have been verified
with performance of flying a Link trainer
under these same conditions.

* The experiments reported here were conducted as
part of a program of research on human factors related
to aircraft instrument lighting under
a research contract (W33-038 ac18317) between the
University of Rochester and the Air Materiel Command,
U. S. Air Forces. They have been reported
to the Aero Medical Laboratory of the Air Materiel
Command in MCREXD-694-21 and TR 6040.

The types of performance criterion which
investigators have used fall into three main
classes: (a) speed of response, (b) accuracy
of response, (c) physiological correlates.
Cobb (5, 6) has investigated speed of vision
as a function of illumination level. He employed
various patterns associated with confusion
patterns as stimulus objects and found
that a logarithmic relationship held for parallel
bars between 1 and 100 foot-candles, but
under more complicated conditions the relationship
breaks down so that the expected
gain in sensitivity due to increased intensity is
not realized. Ferree and Rand (10), investigating
speed of vision as a function of brightness
with special reference to industrial
situations, found that on work of a factory
type, involving important use of the eyes,
speed of vision increased as brightness increased
up to a maximum. Many studies of
reading performance have used speed as a
criterion, e.g., Tinker (32, 33, 35).

Accuracy, usually in conjunction with
speed, has been used extensively in practical
situations as a measure of performance.
Typical studies are White, Britten,
Ives, and Thompson’s (41) study of ocular
efficiency and fatigue among letter separators
as a function of brightness level, and Weston
and Taylor’s (40) investigation of fine type-setting
done by hand as a function of brightness
level. Most of the threshold studies,
however, are accuracy studies which do not
involve speed of reaction, e.g., Hartline (15),
Brown (3), Brown and Mize (4), Graham
and Hunter (14), etc.

Luckiesh and Moss (17, 18, 19) employed
such physiological correlates as blink rate,
heart and pulse rate, metabolic ratios, etc.,
as performance criteria for readability. McFarland,
Knehr and Berens (23) found that
metabolic ratio and pulse rate are inadequate
criteria for reading. Tinker’s (34) numerous
experiments throw doubt on the use of blink
rate as a criterion.

From his consideration of the existing literature,
the writer believes that in future experiments
on visual performance: (a) visually
screened subjects should be used so that they
fall into a “normal” category; (b) the stimulus
objects should be legible and above the
resolution threshold of the eye at all brightnesses
tested; (c) light quality should be
controlled and specified; (d) light intensity
should preferably be designated as brightness;
and (e) performance criteria should be
speed and/or accuracy with the possible use
of certain physiological correlates in the case
of fatigue studies.

In consideration of the above, and with the
aim of contributing experimental data to fill
the gaps in our knowledge of visual performance
as a function of low brightness levels,
four representative visual tasks were chosen
for investigation: (1) judgment of magnitude
of an illusion; (2) motion threshold; (3)
depth perception; and (4) a simple addition
task. Each task was investigated under five
brightness levels in the crucial range of .005
foot-lamberts (which is just slightly above the
values usually stated as cone threshold, i.e.,
.002 to .004 F.-L.) to 1.00 foot-lambert.


Experiment I. Magnitude of the
Müller-Lyer Effect


This first experiment is a study relating 
performance in the judgments of equality of 
the two lines of the Müller-Lyer figure to
various low photopic levels of brightness.
The Müller-Lyer, one of the best known visual
geometric illusions, has many variations. 
The basic example is a figure consisting of 
two straight parallel lines of equal length. 
Each line is terminated at each end by two 
short oblique lines forming an angle whose 
apex is at the end of the major line. On one 
line the oblique lines extend back toward 
the center of the major line, on the other they 
extend away. The illusion, which is a po- 
tent one, consists in perceiving the latter line 
as longer than the former. 

This illusion was selected as one of the 
visual tasks to be investigated here because it 
is representative of a general class of visual 
illusions and hence is a rather important 
visual perceptual task of judgment. 

Many variables affect the judgment of 
visual illusions. Our past experience, associa- 
tions, demands, desires, and more or less ob- 
scure influences may create illusions. The 
physical characteristics of the stimulus object 
are also of paramount importance. The loca- 
tion of the object in the visual field, the struc- 
ture of the field, equivocal figures, the influ- 
ence of angles, color, irradiation, and bright- 
ness contrast and lighting and shadows are 
just a few of the main variables causing illu- 
sions. 

Our present interest is to investigate the 
relation of the magnitude of effect of an il- 
lusory figure to low photopic brightness levels. 
We want to answer the following two questions:
Do geometric illusions increase in effect
under low brightness levels as compared with 
ordinary illumination? Is there a critical 
value of illumination above which the mag- 
nitude of the illusion does not vary with 
brightness? 


Method 


Apparatus. The general plan of the ap- 
paratus followed that employed by Spragg 
and Rock (29) in their studies of dial read- 
ing performance as related to low photopic 
illumination levels. 


The subject was seated in a three-sided booth, 
approximately 4 X 4 feet, facing the middle wall. 
The entire visual field was painted a matte black. 
Placed in the 14 X 11 inch aperture of the front 
wall was the Müller-Lyer figure. The center of
the figure was 28 inches from the subject’s eyes
and 15° below his horizontal line of regard. An
adjustable head rest, mounted on a horizontal
bar, served to keep the subject’s head in a satisfactorily
constant and comfortable position.

The stimulus object was a Müller-Lyer figure,
with a 7.5 cm. stationary standard arrow-headed 
part. On the subject’s right side was a sliding 
board on which there was a line with an arrow 
feather at one end. This side of the figure could 
be moved under the standard until its line was 
the desired length. The lines were 2 mm. in 
width, white on a black background. The ob- 
liques were 3 cm. and at a 27° angle. The vari- 
able stimulus was moved by the experimenter by 
means of a rack and pinion which could be varied 
by equal steps in a smooth manner. 

The light sources were two 60 w. Mazda lamps 
in cans fitted with filters and aperture holders. 
An assembly consisted of a ground-glass square 
of heat-resistant glass, and a brass plate with a 
circular aperture drilled in the center. Voltage 
was maintained at a constant level by means of 
a variac, Model V-5MT, and a monitoring Weston
A.C. Voltmeter, Model 433. The color temperature
was in the neighborhood of 2400° K.
Chosen levels of illumination were achieved by
means of accurately drilled apertures in removable
brass plates. All light sources had two
ground-glass surfaces in the optical pathway to
achieve high dispersion.

Data sheets were prepared in advance. These
indicated the five levels of illumination, with
columns under each for recording the subject’s
responses, and a section for the subject’s comments.


Subjects. Ten male subjects served as the experimental
population. All were students at the
University of Rochester (three graduates and
seven undergraduates) and were all in their late
teens or twenties in age. Subjects chosen were
those who passed a rigorous visual screening,
using the Keystone Telebinocular. All subjects
had: normal ophthalmoscopy; 20/20 visual acuity,
monocularly and binocularly, at distance and
near, without glasses; 80 per cent or better
stereopsis; no vertical imbalance; less than [...]
prism diopters physiological exophoria; less than
2 prism diopters of exophoria or esophoria at
distance; and normal color vision.

Procedure. Each subject was allowed to become
cone dark adapted (approximately ten
minutes) before the illumination was turned on.
With the stimulus object illuminated with .05
foot-lambert of brightness, the subject was given
the instructions:

This is an experiment to determine the influence
of varying brightnesses of illumination on
the perception of the length of a line complicated
so as to present an illusion. Yes, this is a very
common illusion, but I want you to tell me when
this line (pointing to the variable) looks to you
to be equal to this stationary line. (With the
variable much larger than the standard.) Now
the line is much larger than the standard. You
are to say “Now” when the lines appear to you
to be equal in length. (Decrease the size of the
variable stimulus so that it is much smaller than
the standard.) Now the line is smaller than the
standard. Say “Now” when it appears to you
that the lines are equal. Five practice trials were
given.

On the formal trials each subject was first presented
with the variable stimulus larger than the
standard. The stimulus was decreased by the experimenter
in successive steps of [...] mm. until
the subject reported equality. Then the experimenter
presented to the subject a stimulus smaller
than the standard, and the stimulus was altered
by successive increments until the subject reported
he no longer perceived any difference.
This is a modified method of limits called the
method of equivalents, in which only points of
equality are recorded, approached from the two
possible directions. Ten responses at each illumination
level for each subject were required, five
ascending and five descending. It is realized that
space errors are present in this method.
We are interested in differences between performance
at the various levels of brightness and since
these errors are presumably constant, they should
not bias the results.

The levels of illumination were chosen to encompass
the critical range of low photopic brightness
as found in the experiment of Spragg and
Rock (29). The levels range from just above
cone threshold to 1 foot-lambert. The values
were: .005, .01, .05, .1, and 1 foot-lamberts.

Brightness measurements were made with a
Macbeth Illuminometer used in the subject’s position
and directed against a white square painted
with the same paint as that of the stimulus object.

Since five levels of illumination were used, it
was necessary to employ balanced sequences of
brightness levels to control possible practice and
fatigue effects. In changing from one level to
another the subjects were given from five to ten
minutes for adaptation.
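The report does not say how the balanced sequences of brightness levels were built. It states only, under Practice Effects, that each level appeared in each ordinal position the same number of times. One simple scheme with that property is a cyclic Latin square used twice over the ten subjects. The sketch below illustrates that assumed scheme and is not the author's documented procedure. The function and variable names are hypothetical.

```python
def cyclic_latin_square(levels):
    """Return n orderings of n levels; each level occupies each position once."""
    n = len(levels)
    return [[levels[(row + pos) % n] for pos in range(n)] for row in range(n)]


# The five brightness levels, in foot-lamberts, as listed in the Procedure.
LEVELS_FL = [0.005, 0.01, 0.05, 0.1, 1.0]

square = cyclic_latin_square(LEVELS_FL)

# Assumption: subjects 1-10 are assigned the five rows in rotation, so every
# row is used twice and every level falls in every position exactly twice.
orders = {subject: square[(subject - 1) % len(square)] for subject in range(1, 11)}

for subject, order in orders.items():
    print(subject, order)
```

A cyclic square balances ordinal position only. It does not balance which level immediately precedes which, so carry-over from one adaptation state to the next would still need the five-to-ten-minute adaptation periods the text describes.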

Each subject was given a visual screening test 
on one day, and the entire series of judgments 
on another day. 


Results 


The data of this experiment consist of judg- 
ments of equality of length of lines in the 
Müller-Lyer figure for ten subjects under five
different brightness levels.

The mean errors of judgment are presented
in Table 1 which shows for each subject the
mean error, sigma, standard error, and per
cent error of standard at each of the five 
brightness levels. 

Inspection of Table 1 and Figure 1 shows 
that: (a) the effect of the illusion at the three 
higher brightness levels is considerably less 
than at the two lower levels; (b) above 0.05 
foot-lamberts increasing brightness produces 
no significant improvement in performance; 
while (c) below 0.05 foot-lamberts decreasing
brightness is clearly associated with poorer
performance on this task.

Since our principal concern is with per- 
formance as a function of brightness, a £ 


Table 1 


Showing the Magnitude of Errors in mm. in Judgments
of Equality of the Müller-Lyer Figure
at Five Brightness Levels


Brightness in Foot-Lamberts


Subjects 0.005 0.01 0.05 0.10 1.00
1 24. i 9 8 $ 
2 29 24 M i t 
a o ig s wW 8 
$ 37 gg i 2 2% 
5 io u | w n 
2 ne? g 2 m 
1 30 24 10 s 10 
3 å y b w g 
i i5 B 9 4 a 
m a is D o L 
Sum a3 no 142 133 138 
Mean 343 210 T 183 1.3 
. oct 0.73 048 On 0.63 
= 020 024 015 02 0.21 
j 17.7% 184% 


% of Stand, 32.5% 28.0% 18.9% 




Fig. 1. Accuracy of judgments of equality of the Müller-Lyer
figure as a function of brightness (abscissa: log I, in
foot-lamberts). Visual performance as a function of low
photopic brightness levels.


analysis was carried out comparing group 
performance for each pair of brightness levels. 
A summary of this analysis is presented in 
Table 2. From Table 2 it is seen that all 
differences between brightness levels below 
.01 foot-lamberts and those above .01 foot- 
lamberts are significant at the 1 per cent level. 

On the basis of the data presented above, 
it seems clear that: (1) there is a larger error 
in perception of length as tested in the Müller- 
Lyer figure below the region of .05 foot- 
lamberts; and (2) little or no increase in 
performance results from increasing bright- 
ness above this level. 

Practice Effects. It will be recalled that 
each subject was given five practice trials be- 
fore formal trials were begun, in order to re- 
duce practice effects. The adaptation periods 
between levels would tend also to reduce 
practice effects. In order to determine 
whether practice effects were playing a sig- 
nificant role in this situation the errors were 
tabulated for each subject in terms of first 
brightness level tested, second level tested, 
etc. Since each brightness level appeared in 
each ordinal position the same number of 
times, no advantage due to sequence is pres- 
ent for any brightness level. 

The t tests of the several differences in-
dicate that there is no evidence of a practice
effect. Early experimenters with illusions 
noticed that continued experience with one 
certain figure diminished the amount of the 
illusion. Heymans (42) and particularly 
Judd (42) made a systematic study of the 




Table 2


Values of t, Comparing Mean Magnitude of Errors in
Judgments of Equality of the Müller-Lyer
Figure at Five Brightness Levels


Brightness in Foot-Lamberts


.005 .01 .05 .10 1.00
.005 —
.01 3.62** — — — —
.05 7.06** 6.42** — — —
-10 LOSS Fapt 134 — — 
1.00 TOE 632 080 G77 — 


** Significant at 1 per cent level. 


practice effect and found that the illusion 
gradually diminished and approached zero. 
The practice effect held good only for the 
original position of the figure. Reversal of 
figures returned the illusion to full strength. 
The illusion was revived even in the original 
figure by standing off and looking at it casu- 
ally as a whole. In our experimental pro- 
cedure the effect of practice was reduced satis- 
factorily by the use of units of experimenta-
tion containing a small number of trials sepa-
rated by adaptation periods.
Although the experimental design was such 
as to minimize the effects of errors of habitua- 
tion and/or expectancy (by giving ascending 
and descending trials alternately and changing 
the length of the trials) the analysis of the 
ascending and descending series at each
brightness level shows the ascending series 
to be greater in all cases than the descending 
series. The differences were: .58 mm. at
.005 foot-lamberts, .74 mm. at .01 foot-lam- 
berts, .96 mm. at .05 foot-lamberts, 1.06 mm. 
at .10 foot-lamberts, and .92 mm. at 1.00 foot- 
lamberts. These are errors of habituation. 
All subjects commented that judgments 
were more difficult to make under the two 
lower brightnesses but they all reported that 
they thought they did as well under the lower 
brightnesses as under the higher levels.


Discussion 

The results reported above indicate for this
task a critical level of brightness (about .05
foot-lamberts) below which the magnitude of
perceptual errors increases significantly from
the errors at and above this brightness level.
Above this critical level further increases in
brightness up to 1 foot-lambert (and possibly
indefinitely), produce no significant incre-
ments of performance. It would seem as
though once a subject has been given suf-
ficient brightness to perform the task with
ease, brightness is no longer a significant
variable.


Experiment II. Absolute Motion Threshold 


One of the first questions to be raised with
reference to the amounts of the motion that
are either just perceptible or just not per-
ceptible, is that of the so-called threshold
value. In this study only the lower absolute
threshold is to be considered. It is known
that when velocity of a stimulus is diminished
there is a critical level of velocity beneath
which no motion is perceived. Perception of
motion not only depends on: (1) the physical
velocity of the moving stimulus, but also on
such variables as (2) form and size of the
stimulus; (3) presence or absence of stationary
reference objects, and their nature; (4) ab-
solute and relative brightness of the stimulus
and the background; (5) absolute and rela-
tive color of stimulus and background; (6)
light or dark adaptation of the eye; (7) mo-
nocular and binocular observation; (8) macu-
lar or peripheral observation of the stimulus;
(9) distance of observation; (10) duration
of the observation period; (11) eyes allowed
to move or required to fixate; and (12) char-
acteristics of the path of movement. Under
usual operational conditions the observation
is binocular with macular and/or peripheral
fixation, non-limited duration of observation
period, and the eyes moving in normal fashion.

In the present experiment the variables
were treated in the following manner: the
relation of physical velocity of the moving
stimulus to the absolute and relative bright-
ness of the stimulus and background was
measured; absolute and relative color of the
stimulus and background, size and form of
stimulus, adaptation of the eye, distance of
the observation, and characteristics of the
path of movement were all controlled.


Method


Apparatus. The general plan of the experi-
mental situation has been described in the preced-
ing study.




The subject was seated in the three-sided booth 
facing the front wall. In the front wall was a 
2 X 4 inch aperture over which was superimposed 
a double gradient neutral density filter with the 
center clear and increasing in density toward the 
ends. The stimulus grid was presented in this 

aperture and the double gradient neutral density
filter acted to gradually blur the edges so that
no reference boundary was evident in the
field.

The stimulus grid consisted of high-contrast 
photographic reproductions of 2 mm. wide alter- 
nate black and white lines. The grid was a con- 
tinuous circular band, 2 inches in height and 18 
inches in circumference. It was carried on three 
rollers placed so as to form a triangle 71⁄2 inches 
on the aperture side, 54 inches and 47 inches 
on the other two sides. The rollers were 4 inch 
rubber covered dowels. The roller not in the
aperture was driven by a 1/60 horsepower Gen-
eral Electric A-C motor, model 5KH13E 19,
type K.J., with a friction clutch attachment in
conjunction with a 1/100 reduction gear. The
actual velocity of the driven shaft was recorded
in R.P.M. by a tachometer. The velocity of the
driven shaft could be changed continuously and
smoothly through a range of .05 mm. per sec.
to 2 mm. per sec. A masking motor was em-
ployed in conjunction with the apparatus so as
to mask any possible noise cues.

The experimenter was seated at a small work
table placed against the outside of the middle
wall of the booth. The velocity regulating knob
was within easy reach and the dial face of the
tachometer was directly in front of him. On the
table before him were located the motion appa-
ratus described above, a Variac and a voltmeter
for control of the subject’s lights, a carefully
hooded lamp to provide minimal illumination
and a data sheet to record velocities and subject
remarks.

The light sources and controls were as in the
previous experiment.
Data sheets were prepared in advance as in the
previous experiment.

Subjects. The same ten subjects served. Two
weeks or more elapsed between their participation
in the two experiments.

Procedure. Each subject was allowed to become
cone dark adapted (approximately 10 minutes)
before the illumination was turned on. With .05
foot-lamberts of brightness and the stimulus
velocity very low (subliminal) the subject was
instructed:


This is an experiment to determine the influ-
ence of varying brightnesses of illumination on
the moving of strips. You are to look at the
center of the screen, look at three or four
strips and say “Now” when you see the
strips move in a regular fashion across the
screen. (Increase the velocity until above
threshold.) Now the strips are moving across




Table 3 


Showing the Mean Tachometer Readings at the Point 
of Absolute Motion Threshold and Mean 
» Values for Minutes of Arc/Second 
at Five Brightness Levels 


Brightness in Foot-Lamberts 


Subject .005 .01 .05 .1 1.00 10.00 
1 im fa 13 I0 | 56 
2 1 U Et l S 
3 14 14 14 F 
4 15 15 i4 8 E l 
5 3 TS 33 7 6 
6 14 14 14 7 oy eo 
7 15 14 14 8 8 
8 i5 1 9 5 A 
9 16 13 13 7 a 
10 14 10 10 6 6 6 
Mean Tach. Read. (RPM) 1.42 1.30 1.25 .71 .60 .60
Mean, in Min. of Arc/Sec. .40 .36 .35 .20 .17
σ Tach. 0.10 0.15 0.17 0.13 0.13
SE Tach. 0.03 0.05 0.06 0.04 0.04


the screen in a regular fashion; say “Now”
when you can’t see them moving in this regular
fashion.


Results


The data of this experiment consist of 
judgments of the presence or absence of mo- 
tion (absolute motion perception thresholds) 
for ten subjects under five different illumina- 
tion levels. 

The means of ten judgments for each sub- 
ject are presented in Table 3 which shows for 
each subject not only the mean velocity in 
R.P.M. but also in minutes of arc per sec. at 
each of the five brightness levels. 
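
The relation between a linear stimulus velocity and the angular velocity reported in minutes or seconds of arc per sec. can be sketched as follows. The viewing distance is not stated in this section, so `distance_mm` is a free parameter and the value used in the example is purely hypothetical; only the geometry and the minutes-to-seconds conversion are standard.

```python
import math

# Seconds of arc in one radian (about 206,265).
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi

def angular_velocity_arcsec(linear_mm_per_sec, distance_mm):
    """Angular velocity, in seconds of arc per second, of a stimulus moving
    at linear_mm_per_sec across the line of sight at distance_mm."""
    return math.atan(linear_mm_per_sec / distance_mm) * ARCSEC_PER_RADIAN

def arcmin_to_arcsec(minutes_of_arc):
    """Convert minutes of arc to seconds of arc."""
    return minutes_of_arc * 60.0

# Hypothetical example: 1 mm per sec. viewed from 1000 mm.
example_arcsec = angular_velocity_arcsec(1.0, 1000.0)

# 0.40 minutes of arc per sec. is the same rate as 24 seconds of arc per sec.
threshold_arcsec = arcmin_to_arcsec(0.40)
```

The second conversion is the one needed to compare values tabled in minutes of arc per sec. with thresholds quoted in seconds of arc per sec.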

Inspection of Table 3 and Figure 2 sug- 
gests that: (1) the absolute motion percep- 
tion threshold is markedly lower at the higher 
brightness levels; (2) there is a sharp change 
in motion perception performance between 
.05 and .1 foot-lamberts; and (3) there is 
relatively little improvement in performance 
above 0.05 foot-lamberts. 

Since our principal concern is with per- 
formance as a function of brightness level, a 
t analysis was carried out comparing group 
performance for each pair of brightness levels. 




Fig. 2. Absolute motion threshold as a function of brightness
(ordinate: mean minutes of arc per sec.; abscissa: log I, in
foot-lamberts). Visual performance as a function of low
photopic brightness levels.


A summary of this analysis is presented in 
Table 4. From Table 4 it is seen that all 
differences which cross the .05 foot-lambert 
level are significant at the 1 per cent level 
while no difference that does not cross this 
value is significant at the 1 per cent level. 
Three other differences are significant at the 
5 per cent level: between .005 and .01; .005
and .05; and .1 and 1.0. A supplementary 
test was made on three of the subjects at 10 
foot-lamberts; Table 3 shows that there is 
no difference between the means at this level 
and at 1 foot-lambert. 

On the basis of the data presented above 
it seems clear that: (1) motion perception 
performance increases sharply in the region 
of .05 foot-lamberts; and (2) relatively little 
increase in absolute motion threshold results 
from increasing brightness above this level. 

Practice Effects. The method employed in 
the investigation of practice effects was the 
same as in Experiment I. Each brightness 
level appeared in each ordinal position an
equal number of times. A t analysis of the
several differences indicated no significant 
practice effects in this experiment. The 
analysis of the ascending and descending 
series at each brightness level shows the 
ascending series to be greater in all cases 
than the descending series: these are errors 
of habituation. These errors of habituation 
prove to be small and relatively constant for 
all brightness levels. The differences were: 
.04 R.P.M. at .005 foot-lamberts, , è 
at .01 foot-lamberts, .18 R.P.M. at .05 foot-
lamberts, .10 R.P.M. at .1 foot-lamberts,
and .08 R.P.M. at 1.00 foot-lamberts.

The subjects’ comments are of interest in 
this experimental situation. Subjective re- 
ports of pulsations in a plane perpendicular 
to the movement and pulsations in the plane 
of the movement were frequent. These pulsa- 
tions developed before motion was apparent, 
but in decreasing supraliminal motion to no 
motion the pulsations were usually not re- 
ported. The subjects typically believed their 
performance to be about equally good at all 
levels of brightness tested. 


Discussion 


The results of absolute motion threshold
perception reported above indicate a critical
level of brightness, between .05 and .1 foot-
lamberts, below which subjects’ absolute mo-
tion threshold is significantly raised. Above
this level the absolute motion threshold is a
minimum and a further increase in bright-
ness, at least up to 10 foot-lamberts and
probably indefinitely, produces no further sig-
nificant increments of performance. Again
it seems as though once a subject has been
given just enough brightness to perform the
task with ease, brightness is no longer a sig-
nificant variable.

The absolute motion thresholds found in
this experiment ranged from 24 secs. of arc
per sec. at .005 foot-lamberts to 10 secs. of
arc per sec. at 1 foot-lambert (and with three
subjects at 10 foot-lamberts). These results
are lower than those usually stated as typical.
Generally one to two minutes of arc per sec.
is the absolute motion threshold reported.


Table 4


Values of t, Comparing Mean Tachometer Readings at
Point of Absolute Motion Threshold at
Five Brightness Levels


.005 .01 .05 .10 1.00
.005 —
.01 2.44* —
.05 2.69* 1.63 —
.10 12.52** 12.94** 40.82** —
1.00 19.72** 14.17** 13.99** 2.70* —


* Significant at 5 per cent level.
** Significant at 1 per cent level.




The source quoted is usually Aubert’s ex- 
periment, but his results should be qualified 
with, “under short fixation times,” for Aubert 
follows his results with: “. . . whereas with 
lower velocities it requires several seconds to 
detect motion” (16). With unlimited fixa- 
tion time Munch (39) reported thresholds 
as low as 34 secs. of arc per sec. and Basler 
(39) as low as 13 secs. of arc per sec. for 
foveal fixation under daylight conditions. 
This compares closely with the present re- 
sults. Thus, it would seem that under ordi- 
nary operational conditions with unlimited 
fixation time, brightness above .05 foot-lam- 
berts, binocular vision, cone adapted eyes, 
with a blurred but stationary reference in the 
visual field, and stimulus size subtending 10 
minutes of arc the absolute motion threshold 
is of the order of 10 sec. of arc per sec. Be- 
low a brightness of .05 foot-lamberts the mo- 
tion threshold increases rapidly up to 24 sec. 
of arc per sec. at .005 foot-lamberts. 


Experiment III. Depth Perception 


The two preceding experiments have pre- 
sented data concerning performance on a 
visual illusion and on a motion discrimina- 
tion task as a function of low photopic bright-
ness levels. The present study is a further
extension of these studies to performance on 
a depth perception task as a function of the 
same low photopic brightness levels. 

It is evident that it is impossible for the
retinal image alone to give us a tridimensional
perception, for the retinal image is only two-
dimensional. Stereopsis can best be consid-
ered a unification of many visual impressions
and among the factors utilized to a greater
or lesser extent are: (1) size of retinal
image, (2) aerial perspective, (3) mathe-
matical perspective, (4) distribution of light
and shadows, (5) intervening objects, (6)
convergence and accommodation, and (7)
parallax.

The most important factor contributing
towards binocular stereopsis is the phenome-
non of binocular parallax. It is the absence
of this factor in monocular vision that renders
the estimation of depth so difficult, especially
with the head fixed.




Method 


Apparatus. The experimental situation has 
been described above. 

For this experiment the stimulus field con- 
tained three dull white rods, the two outer rods 
fixed and the center rod movable. The rods 
were separated by 34” and were 2 mm. in diam- 
eter. Approximately one inch in the center re- 
gion of the rods was visible to the subject. The 
background was a matte black. The movable 
rod was moved by means of a rack and pinion by 
the experimenter. Brightness was controlled as 
in the previous experiments. 

Subjects. The same ten subjects who served 
in the previous studies were used in the present 
study. Two weeks or more elapsed between their 
participation in this experiment and the preced- 
ing one. 

Procedures. Each subject was allowed to be- 
come cone dark adapted (approximately 10 min- 
utes) before the illumination was turned on. The 
instructions were as follows: 


This is an experiment to determine the influ- 
ence of varying brightnesses of illumination on 
the performance of depth perception. You 
are to look at the three white rods. The cen- 
ter one will be moved back and forth; you are 
to tell me when you see it in the same plane 
as the other two. Now the center rod is back 
of the other two. Say “equal” when it ap- 
pears to be in the same plane as the two fixed 
rods; now tell me when it appears in front of 
the other rods. Now the rod is well in front 
of the other two rods. Say “equal” when it 
appears to you to be in the same plane as the 


Table 5 


Showing the Mean Constant Errors in Depth 
Perception in mm. Under Each of the 
Five Levels of Brightness 


Brightness in Foot-Lamberts 


Subjects .005 .01 .05 .1 1.00
1 32 24 -9 — 4 — 3 

2 30 39 Pi a) a 

3 47 32 Ki Ao =i 

4 1.4 1:2 -1 = 6 - 3 

5 4.3 3.5 6 «it 1.5 

6 3.3 mul —1.0 — 2 —.7 

7 3.2 fi — A = A = iB 

8 2.9 LF .9 0 ao 
9 7.5 2.8 2.2 Lak .9 
10 1.8 3.5 — «I = 4 -— 4 
Mean 3.53 2.54 0.20 0.18 .04
σ 1.62 0.95 0.52 0.47 0.40
SE 0.54 0.32 0.17 0.16 0.13




others. Now tell me when it appears behind 
the other two rods. 


Five ascending (toward observer) and five de- 
scending (away from observer) trials were given 
before the formal trials were begun; this fore-test 
served to reduce the practice effect. 

On the formal trials each subject was given five 
ascending and five descending trials at each of 
five brightness levels. The method used was the 
traditional method of limits. 


The levels of brightness were the same as in the
preceding studies.


Results


The data of this experiment consist of con- 
stant errors, and average errors (non-alge- 
braic), made by the group of ten subjects 
under the five levels of brightness employed. 
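
The distinction between the two error measures named above can be made concrete with a small sketch. The settings used here are invented for illustration and are not data from the experiment; the convention that a positive value means the movable rod was set in front of the fixed rods follows the description of the constant errors in the text.

```python
def constant_error(settings_mm):
    """Algebraic (signed) mean of the settings: the constant error."""
    return sum(settings_mm) / len(settings_mm)

def average_error(settings_mm):
    """Mean of the absolute settings: the average (non-algebraic) error."""
    return sum(abs(x) for x in settings_mm) / len(settings_mm)

# Hypothetical settings in mm. relative to the plane of the fixed rods
# (positive = in front, negative = behind).
trials = [1.0, -2.0, 3.0, -1.0]

ce = constant_error(trials)  # signed errors partly cancel
ae = average_error(trials)   # magnitudes do not cancel
```

Because signed errors cancel, the constant error can be near zero while the average error remains appreciable, which is why the two measures are analyzed separately.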

The constant error data are presented in 
Table 5, which shows for each subject the 
mean of 20 judgments at each brightness 
level. Mean values for the group are thus 
based on 200 judgments at each level. The 
mean values from Table 5 are shown in Fig- 
ure 3 as a function of log brightness in foot- 
lamberts. 

Inspection of Table 5 and of Figure 3 in- 
dicates that: (1) performance is markedly 
more accurate at the higher brightness levels; 
(2) there is a relatively sharp change in per- 
formance between .01 and .05 foot-lamberts; 
and (3) the mean constant errors are all posi- 
tive in direction, i.e., at the judgment of 
equality the variable is in all cases in front 
of the two fixed rods. A t analysis was carried
out comparing group performance for each 
pair of brightness levels and a summary of 


Fig. 3. Accuracy of depth perception as a function of
brightness (average error and constant error; ordinate:
mean errors in mm.; abscissa: log I, in foot-lamberts).
Visual performance as a function of low photopic
brightness levels.




Table 6 


Values of t, Comparing Constant Errors in Depth 
Perception at Five Brightness Levels 


Brightness in Foot-Lamberts 


.005 .01 .05 .1 1.00
.005 —
.01 1.81 — — — —
.05 TBE ggg = = A 
á 931 gips 133 å = = 
1.00 7.76 8.93" 0.76 0.66 — 


** Significant at 1 per cent level. 


this analysis is presented in Table 6. From
this table it is seen that all differences which
cross the .01 foot-lambert level are significant
at the 1 per cent level, while no difference that
does not cross this value is significant at the
5 per cent level.
The data for average errors are presented
in Table 7 and the t analysis in Table 8.
Figure 3 presents the results graphically. The
results are on the whole the same for the two
kinds of error analysis. The only noticeable
difference is that the average errors continue
to decrease above the critical brightness level
while the constant error values change very
little above this point. Although (see Table
8) the differences between .05 and .1 and be-


Table 7 


Showing the Mean Average Errors in Depth 
Perception in mm. Under Each of the 
Five Levels of Brightness 


Brightness in Foot-Lamberts 


Subjects .005 .01 .05 .1 1.00

1 55 59 44 4l A 

2 48 6000 29 23 ya 

3 5.2 4.1 3.3 24 3.0 

= 60 55 34 33 %2 

65 54 30 31 36 

s G1 52 36 46 a 

7 55 63 46 AA 

8 66 64 59 44 34 

3 4 71 46 51 %32 

10 66 66 56 5.6 3 

Mean 63 59 41 39 09 

3 15 j 10 1.0 3 
SEx Us p 


05 03 0.3 0.3 




Table 8 


Values of t, Comparing Average Errors in Depth
Perception at Five Brightness Levels


Brightness in Foot-Lamberts


.005 .01 .05 .1 1.00
.005 —
.01 0.98 —
05 asg Gare = - = 
al 6.15** 12.50** 1.00 — — 
1.00 5a 10.71% 340 227° — 


** Significant at 1 per cent level. 


tween .1 and 1.00 foot-lamberts are not sig- 
nificant, the difference between .05 and 1.00 
is significant at 1 per cent level. 

Analysis of the above data shows that: (1) 
Accuracy of depth perception performance 
decreases sharply below .05 foot-lamberts 
and (2) little increase in accuracy results 
from increases in brightness above this level. 

The results of this experiment, expressed 
in terms of angle of binocular parallax, are 
shown in Table 9.
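
The conversion from a depth error in mm. to an angle of binocular parallax can be sketched with the usual small-angle approximation, in which the angle in radians is the interocular distance times the depth error divided by the square of the viewing distance. Neither the interocular distance nor the viewing distance is given in this section; the values below (65 mm. and 730 mm.) are hypothetical and were chosen only so that the output is of the same order as the tabled values.

```python
import math

def parallax_arcsec(depth_error_mm, interocular_mm, distance_mm):
    """Approximate angle of binocular parallax, in seconds of arc, for a
    depth error at a given viewing distance (small-angle approximation)."""
    radians = interocular_mm * depth_error_mm / distance_mm ** 2
    return math.degrees(radians) * 3600

# Hypothetical geometry; not taken from the article.
INTEROCULAR_MM = 65.0
DISTANCE_MM = 730.0

# Mean constant error at .005 foot-lamberts (3.53 mm., Table 5).
parallax_low = parallax_arcsec(3.53, INTEROCULAR_MM, DISTANCE_MM)
```

Under these assumed dimensions a 3.53 mm. error corresponds to a parallax of somewhat under 90 seconds of arc; a different viewing distance would scale the result by the inverse square of the distance.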

Practice Effects. Practice or “warm-up”
effects were analyzed as in the other experi-
ments. A t test analysis of the several dif-
ferences indicates no evidence of a practice
effect for the experiment.

The subjects’ comments were of interest
in that seven of the ten subjects made some
reference to the apparent decrease in tri-
angularity. All these seven subjects stated
in various ways that the center movable rod
was equal when the apparent triangularity
was zero and that the fixed rods were used as
the base standard.


Discussion 


The results of the constant and variable
error reported above clearly indicate that for
depth perception there is a critical level of
brightness between .01 and .05 foot-lamberts.
Depth perception is relatively difficult below
this level. Above this value the task becomes
suddenly easier and both constant and vari-
able errors decrease markedly. Further in-
creases in brightness, at least up to 1 foot-
lambert and probably indefinitely, produce
no further increments of performance.

It is interesting to note that although Muel-
ler and Lloyd (20) state that stereoscopic
acuity decreases in a regular fashion as in- 
tensity decreases, if one plots their results so 
that the actual data points are connected and 
the curve not smoothed a sharp break appears 
at approximately the same brightness level as 
in the present experiment. It is of interest 
also to note that the trend of their results 
would be highly similar to that of the present 
experiment, with performance leveling off 
above approximately .05 millilamberts. 

The absolute binocular parallax values 
found in this experiment for some subjects 
are lower than reported by the earlier experi- 
menters. Bourdon’s (2) 5 seconds was the 
lowest difference reported. Six of our ten 
subjects had five seconds or less at brightness 
level above the critical point of .05 foot- 
lamberts. The variability of these at the 
adequate performance brightness levels was 
great. 

These results add another item to our 
knowledge of critical visual performance levels 
as a function of low photopic brightness levels. 
Depth perception as well as dial reading per- 
formance, illusion errors and absolute motion 
thresholds appear to have approximately the 
same critical level of brightness, above which 
performance is adequate and an increase in 
brightness has relatively little effect and be- 
low which performance is relatively poor. 


Experiment IV. Addition Task 


A simple quantifiable mental task was in- 
vestigated to add to the picture of visual per- 
formance at low photopic brightness levels. 
Reading tasks are the first to come to mind, 
but two serious objections are inherent in the 
use of reading material. First, only time 


Table 9


Showing the Mean Constant Errors and Average Errors
as Angles of Binocular Parallax


Brightness, in     Constant Error    Average Error
Foot-Lamberts      (Sec. of Arc)     (Sec. of Arc)
0.005              90                160
0.01               60                155
0.05               5                 105
0.1                5                 100
1.0                <5                90




scores can be made with any reliability, and 
second, at low photopic levels the material 
would have to be made abnormally large in 
order to avoid the factor of visual acuity as 
a limiting variable. Two of the most produc- 
tive investigators in this field, Tinker and 
Luckiesh, have reported numerous studies, of 
which the two following are typical. Tinker 
(34), investigating reading of 10 point type 
at illuminations ranging from 0.1 to 53 foot 
candles (accuracy held constant), found speed 
of reading to increase rapidly from 0.1 to 3.0 
foot candles and no change in speed of read- 
ing between 3.0 to 53.0 foot candles. No 
change in accuracy of reading was reported. 
Luckiesh and Moss (18) in correlating illumi- 
nation intensity and nervous muscular tension 
resulting from reading 12 point type (large 
type) found the critical intensity level to be 
somewhere between 1 and 10 foot candles. 

Addition tasks have been used by many
investigators as an active mental task.
Thorndike (31) used addition scores of time
and accuracy to investigate practice. Davis
(8) used addition tasks to investigate the
effect of noise on mental work. Rounds,
Schubert, and Poffenberger (26) used addi-
tion tasks to investigate the effects of practice
upon the metabolic cost of mental work.
Freeman (13) employed mental arithmetic 
as a mental task and measured it as a func- 
tion of spread of neuromuscular activity. 
Addition tasks have also been used by many 
other investigators as a mental task. Atkins 
(1) used arithmetic problems of the cancel- 
lation type as a mental task and measured it 
as a function of illumination. His range was 
from 9.6 to 118 foot candles. Performance 
measured by achievement was identical at all 
levels of brightness. 

In the present study performance as meas- 
ured by both speed and accuracy in addition 
problems was investigated as a function of 
low photopic brightness levels. 


Method 


Apparatus. The basic experimental situation 
has been described in the p: i i 


- It was double (11 x 14 
stimulus object was slid 
ew, another stimulus ob- 


ject came immediately into view. Micro-switches 
at each end of the track were arranged so that 
illumination on the stimulus object went off as 
the carrier was moved from one position and 
came on as it reached the other position. In this 
way the shift from one stimulus object to an- 
other was accomplished rapidly in a short in-
terval of darkness and did not require the sub-
ject to make any shift in visual orientation.
Thus, the subject was kept steadily at the same
level of illumination throughout a series of read-
ings, except for an instant of darkness between
the presentation of stimulus objects.
Materials. The stimulus objects, generously
made available to this project by Dr. Mason
Crook and Sam McLaughlin of Tufts College
Air Forces project, consisted of high-contrast
photographic reproductions of 10 point monotype
numbers arranged into 100 items per chart, each
item consisting of a 3-digit problem and its i 
digit sum arranged horizontally. There were five
items per group, four groups per column, and five
columns per chart. Each chart was reproduced
so as to have white figures on a black back-
ground and size was increased two-fold so that
each digit had an over-all height of 3/16 inches.
The problems were selected with predetermined
specifications as to random numbers, repetitions,
zeros, sums, et cetera.
Subjects. The same subjects of the previous
experiments participated in this experiment. A
week or more passed between this experiment
and the preceding experiment.
Procedure. Each subject was given 10 charts
at a brightness of 10 foot-lamberts on the day
preceding the day of the formal trials; this fore-
test served to reduce practice effects.
On the formal trials each subject did two
charts (200 problems) at each of five brightness
levels. While each subject was becoming cone
dark adapted the following instructions were
read:


This is an experiment to determine the influ-
ence of varying brightnesses of illumination on
the performance in simple addition problems.
There are 5 columns of numbers, each column
separated after 5 problems. The first three
numbers are to be added up and if their sum
equals the fourth number, you are to say
“right”; if they add up to some other number,
you are to say “wrong.” You will say “space”
after each 5 problems to indicate the spaces
on the charts; you are to say “new column”
when you start each new column.
When I say “ready” the lights will go off (sub-
ject is preadapted to brightness level to be
used by looking at a matte black chart illumi-
nated with this brightness) and in a moment
they will come on again. When the lights
come on, you are to add the numbers as di-
rected and say “right” or “wrong” to each
problem; remember to say “space” after each




Table 10


Showing the Total Number of Problems in Error in 200
Addition Problems at Each of Five
Brightness Levels
Note: The task was impossible for all subjects at
.005 foot-lamberts.


Brightness in Foot-Lamberts


Subjects   .008    .01     .05     .1      1.00
1          30      29      1       2       1
2          46      36      0       1       0
3          48      21      1       0       1
4          70      54      5       1       2
5          12      6       1       1       2
6          39      26      2       5       1
7          50      34      2       3       3
8          53      55      16      4       5
9          20      23      6       2       2
10         37      24      3       1       3
Mean       40.5    30.8    3.7     2.0     2.0
σ          15.84   14.16   4.38    1.48    1.34
SE         5.28    4.72    1.46    .49     .45


set of 5 problems and remember to say “new
column” when going from one column to the
next one. You are to add down the first
column on the left and proceed to the right.
Add as rapidly and as accurately as you can.


The brightness levels were originally the same
as the preceding experiments, but all subjects
found the task impossible under the .005 foot-
lambert level. The lower level was raised to
.008 foot-lamberts, at which brightness better
than chance results were obtained. The levels
finally used were .008, .01, .05, .1, and 1.00 foot-
lamberts. A sheet of unexposed but developed
paper from the same stock as the stimulus charts
was used as an object for this preadaptation.
Controls to balance out practice and fatigue
effects were the same as in the previous experi-
ments.
Each subject reported for two sessions on con-
secutive days. At the first session subjects were
given the fore-test and on the second the
formal trials were given. Subjects were given no
knowledge of results during the entire experiment.


Results 


The data of this experiment consist of error
and time scores for a group of 10 subjects
under 5 brightness levels.
It is interesting to note that at the .005
foot-lambert level all subjects felt the task was
impossible and refused to attempt it, reporting
that the task would be guesswork and that they
feared injury to eyes. The lowest level was
raised to .008 foot-lamberts where all subjects
did better than chance, although still reporting
difficulty at this low level.

Errors. The principal analysis of errors 
is in terms of error frequency, i.e., the num- 
ber of problems in error in 200 problems. 
These data are presented in Table 10, which 
shows the total number of problems in error 
in 200 addition problems at each of the five 
brightness levels. Mean values of the group 
are thus based on a total of 2,000 readings at 
each level. The mean values from Table 10 
are also shown as the accuracy curve of
Figure 4.
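As a rough check on the summary rows of Table 10, the following sketch recomputes the mean, σ, and standard error for the .01 foot-lambert column. It assumes (this is inferred from the printed figures, not stated in the text) that σ is the population standard deviation and that the standard error is σ/√(N − 1); the function name `summarize` is hypothetical. Under these assumptions the .01 column reproduces the printed 30.8, 14.16, and 4.72; other columns may differ slightly in the last digits.

```python
import math
import statistics

# Errors in 200 problems at .01 foot-lamberts, subjects 1-10 (Table 10).
ERRORS_AT_01 = [29, 36, 21, 54, 6, 26, 34, 55, 23, 24]


def summarize(scores):
    """Return (mean, sigma, standard error) under the assumptions stated above."""
    n = len(scores)
    mean = statistics.mean(scores)
    sigma = statistics.pstdev(scores)   # population SD (divides by N)
    se = sigma / math.sqrt(n - 1)       # assumed formula: sigma / sqrt(N - 1)
    return mean, sigma, se


mean, sigma, se = summarize(ERRORS_AT_01)
print(f"mean={mean:.1f}  sigma={sigma:.2f}  SE={se:.2f}")
```

The same function can be applied to any other column of Table 10 or Table 12 by substituting the ten subject scores.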

Inspection of Table 10 and Figure 4 indicates
that: (1) performance is markedly
more accurate at the higher brightness levels;
(2) there is rapid improvement in performance
up to .05 foot-lamberts; and (3) there is little
or no improvement above .05 foot-lamberts.

A t analysis was carried out comparing
group performance for each pair of bright- 
ness levels. A summary of this analysis is 
presented in Table 11. From this table it is 
seen that all differences between .05 foot- 
lamberts and lower brightness levels are 
highly significant (1 per cent level) and all 
differences between .05 foot-lamberts and 
higher brightness levels are not significant at 
the 5 per cent level. 
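The text does not state which t formula was used. A minimal sketch, assuming a paired (correlated-means) t test on each subject's error counts from Table 10, is given below; `paired_t` is a hypothetical helper, and recomputed values need not match Table 11 exactly. With the counts as printed, the comparison of .05 and .1 foot-lamberts gives about 1.26, and .1 versus 1.00 gives 0 because the two means are equal.

```python
import math
import statistics

# Errors in 200 problems for subjects 1-10 (Table 10), keyed by foot-lamberts.
ERRORS = {
    0.05: [1, 0, 1, 5, 1, 2, 2, 16, 6, 3],
    0.1: [2, 1, 0, 1, 1, 5, 3, 4, 2, 1],
    1.0: [1, 0, 1, 2, 2, 1, 3, 5, 2, 3],
}


def paired_t(a, b):
    """Paired t: mean difference divided by its standard error (df = n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return statistics.mean(diffs) / se


print(round(paired_t(ERRORS[0.05], ERRORS[0.1]), 2))
print(round(paired_t(ERRORS[0.1], ERRORS[1.0]), 2))
```

With 10 subjects there are 9 degrees of freedom, so a value of this size falls well short of the 5 per cent level, in line with the statement that differences above .05 foot-lamberts are not significant.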

From this analysis it seems clear that: (1) 
accuracy of performance in doing addition 
problems increases sharply in the region be- 
tween .01 and .05 foot-lamberts; and (2) 
little or no increase in accuracy results from 
increases in brightness above this level. 


Fig. 4. Visual performance as a function of low
photopic brightness levels: speed (mean time in
seconds) and accuracy (mean errors) of performance
on an addition task as a function of brightness
(log I in foot-lamberts).


424 Milton L. Rock 


Table 11

Values of t, Comparing Mean Number of Errors in 200
Addition Problems at Five Brightness Levels

              Brightness in Foot-Lamberts

           .008       .01       .05       .1      1.00

.008         —
.01       3.28**       —
.05       7.22**    6.84**       —
.1      [illegible] 6.26**     1.26       —
1.00      7.40**    6.40**     1.48      .00       —

** Significant at the 1 per cent level.


Time. Data on the speed of performance 
in doing addition problems at the brightness 
levels studied consist of the time in seconds 
required to do 200 addition problems at each 
brightness level. Each subject did two charts 
of 100 problems each at each brightness level. 
Table 12 presents the total time required to 
do 200 problems for each subject at each 
brightness level, and also the group means. 
The mean values from Table 12 are also 
shown as the speed of response curve of 
Figure 4. 

Figure 4 shows that the curve for time 
scores has the same general shape as the error 
curve. There appears to be a sharp increase 
in performance between .01 and .05 foot-lam- 


Table 12

Showing the Total Time in Seconds Required to Do
Addition Task for 200 Problems at Each
of Five Brightness Levels

              Brightness in Foot-Lamberts

Subjects    .008     .01     .05      .1    1.00

    1       1569    1422     687     623     634
    2       1189     903     348     358     301
    3       1019     839     419     416     338
    4        964    1099     420     359     298
    5        622     571   44[?]   45[?]     406
    6       1083    1007     367     389     356
    7       1394    1046     502     446     [?]
    8        770     865     414     370     332
    9        727     694     448     [?]     [?]
   10       1344    1105     475     425     [?]

Mean      1068.1   955.1   449.5   422.6   392.0
σ          294.2   226.3    90.9    72.7    92.6
SE          98.1    75.4    30.3    24.2    30.9


Table 13

Values of t, Comparing Mean Time in Seconds to Do
200 Addition Problems at Five
Brightness Levels

              Brightness in Foot-Lamberts

           .008       .01       .05       .1      1.00

.008         —
.01       2.56*        —
.05       8.48**    8.50**       —
.1        7.51**    5.76**    2.64*        —
1.00      8.08**    5.70**    5.28**    3.45**     —

* Significant at 5 per cent level.
** Significant at 1 per cent level.


berts. The t analysis in Table 13 shows that
all differences of means are significant at the
5 per cent or 1 per cent level. The difference
between .008 and .01 foot-lamberts is
significant at the 5 per cent level, and the
difference between .05 and .1 foot-lamberts is
significant at the 5 per cent level; all
other differences are significant at the 1 per
cent level. Thus, it can be seen that even
though the general results found for time
agree with those for errors, there is evidence
to support the statement that performance
as measured by speed of doing addition problems
increases significantly with increased brightness
up to 1.0 foot-lamberts.
Practice Effects. It will be recalled that
each subject did 1,000 addition problems before
formal trials were begun, in order to reduce
practice effects. As in the previous experiments,
in order to determine whether
practice effects were playing a significant role,
the error and time scores were tabulated for
each subject in terms of first brightness level
tested, second level tested, etc. The t
analysis showed no evidence of practice effects.


Discussion

The results of the error scores indicate
clearly that there is a critical level of brightness
below which subjects find it difficult to
perform this addition task. Above this level
further increases in brightness, at least up to
1 foot-lambert and very probably indefinitely,
produce no significant increments of performance.
These results agree completely with
those from the preceding experiments.




The results of the time scores, although 
comparable to the error scores in general 
trend, indicate that speed of performance in- 
creases as brightness increases, at least up 
to 1.00 foot-lambert. The increase shown by 
the significance of the several differences is 
not a steady increase but has its greatest rate 
between .01 and .05 foot-lamberts. 

These findings agree with the findings of 
the three preceding experiments in showing 
that performance in an active mental task as 
a function of brightness shows a critical value 
at the same general level of brightness. This 
critical brightness level is between .01 and 
.05 foot-lamberts which is considerably lower 
than the value of 3 foot-lamberts usually reported
as critical for reading performance.
In such studies speed of reading has usually
been the criterion of performance. It has
already been noted that visual acuity may
complicate the picture of reading at low photopic
levels and it has been noted that in some
studies accuracy did not change over a range
of .1 to 53 foot-candles when words were large
enough to be read. The time scores in this
study indicate an increase of performance up
to 1 foot-lambert, which was the highest value
used, but this increase was a differential increase
with slower rates between .05 to .1 to
1.0 foot-lamberts, and a sharp increase between
.01 and .05 foot-lamberts. Error scores
in this study decreased rapidly up to .05 foot-lamberts
and then leveled off, showing no significant
decrease for the higher levels of
brightness.


Over-all Discussion

The present experiments have been concerned
with visual perceptual performance as
a function of low photopic brightness levels.
The functions found, like those of the dial
reading studies from this laboratory, differ
markedly from the functions which have frequently
been reported from studies of the
effects of brightness on visual acuity and
foveal flicker fusion frequency. These latter
functions have been found by numerous investigators
to increase steadily in a manner proportional
to the logarithm of the stimulus
intensity.

In contrast the results from the four experiments
reported here, as well as from the
preceding dial reading experiments, indicate 
that performance improves as stimulus in- 
tensity increases only up to a certain point 
(approximately 0.05 foot-lamberts, depending 
somewhat on the specific task employed). 
Beyond this point increases in stimulus in- 
tensity are relatively unimportant in these 
experiments, the increments in performance 
being small and non-significant. 

These findings raise a number of interest- 
ing theoretical questions with respect to the 
physiological mechanisms and inter-relations 
which may be hypothesized to explain the 
present results. A detailed account of hy- 
potheses which might account for these find- 
ings, and especially of a possible rod-cone 
facilitation and inhibition relationship, will 
not be presented here. Such an account has 
been developed elsewhere in some detail.1

In passing it may be of interest to note that 
a somewhat analogous situation is found in 
the field of audition. When per cent word 
articulation is plotted against stimulus in- 
tensity, there results a performance curve 
with sharp increase in the 10 db. region and 
a leveling off at about 20 db. (24). Myers 
and Harris (21) investigating the emergence 
of a tonal sensation with frequencies from 
500 to 14,000 cps. found a “zone of detecta- 
bility” (intensity area between a 50 per cent 
detection threshold and a 50 per cent pure 
tone threshold) between 2 to 4 db., inde- 
pendent of frequency. In their experiment 
with frequency matching, performance im- 
proved with increase in intensity only up to 
the level of 10 db. It would seem from 
these studies that in audition, as well as in 
visual tasks, there is a critical sensation level 
below which performance is increasingly poor 
and above which increases in stimulus in- 
tensity do not increase appreciably the sub- 
ject’s performance. 

From a practical standpoint the results of 
the present experiments suggest certain mini- 
mum values for adequate performance of 
visual tasks. From the present study and 
other available sources we can summarize for 


1 In the original from which this report was rewritten,
on file as a doctoral dissertation in the
University of Rochester Library.




a variety of visual tasks critical brightness 
levels, below which performance is impaired: 


Müller-Lyer illusion          between .01 — .05 F.L.
Depth perception              between .01 — .5 F.L.
Motion discrimination         between .05 — .1 F.L.
Addition task                 between .01 — .05 F.L.
Dial reading                  approx. .02 F.L.
Critical fusion frequency
  (cone)                      .05 F.L.
Span of apprehension
  (.032 sec. — 1 sec.)        [illegible] .05 F.L.
Panel indicator lights        between .01 — .1 e.f.c.
Form silhouettes              above .003 e.f.c.

In view of the above findings it might seem 
advisable to consider .05 to .1 foot-lamberts 
(equals .1 e.f.c.), which is one of the highest 
values given, to be the limiting values to be 
used in practical situations. This indicates 
that in situations where the maximum quality 
and quantity of performance is required with 
the minimum brightness a value of approxi- 
mately .05 to .1 foot-lamberts should be em- 
ployed. Lighting of airplane cockpits, auto- 
motive and rail operator compartments and 
other situations which require good visual 
performance in the operator’s compartment 
plus adequate dark adaptation permitting for 
good form discrimination are situations to 
which this finding is relevant. 

Somewhat aside from the present data but 
as a rather interesting extension is the pro- 
posed use of a flood-type light yielding this 
critical brightness level in the operator’s com- 
partment (so situated as not to give reflec- 
tions from the windshield, etc.). This should 
serve to raise the adaptation level of the eyes 
to the critical level where form and silhouette 
discrimination is adequate. On-coming head- 
lights or disturbing flashes of various types 
should have less “blinding” or “dazzle” ef- 
fect because the adaptation change of the eyes 
would be less than that which is now required 
(from dark or near dark adaptation to bright 
on-coming lights or to flashes of lights). 
Since the visual performance that is needed
by the operator is one of form, silhouette,
depth perception, motion acuity and minimization
of illusion, etc., his visual performance
outside the compartment should
also improve or at least not be impaired. An

emphasis on the physiological cause of “daz- 
zle” rather than the changing of the physical 
and optical constituents of headlights, wind- 
shields, gun flashes, etc., may be a more fruit- 
ful approach to the problem. 


Summary 


A systematic investigation of performance 
in visual tasks as a function of low photopic 
brightness levels was attempted. Four types 
of visual tasks were investigated: judgment 
of magnitude of an illusion, absolute thresh- 
old for motion, depth perception and a simple 
addition task. All tasks were investigated 
under five brightness levels in the range of 
.005 foot-lamberts to 1.00 foot-lamberts. In
each of the experiments, critical brightness
levels were found below which performance
was increasingly poor. Increased brightness
above the critical level improved performance
relatively little or not at all. The critical
level for motion threshold was .1 foot-lamberts;
for the other tasks approximately .05
foot-lamberts. It was suggested that for
maximum performance on visual tasks, with
minimum brightness, illumination should be
adjusted to yield brightness values of .05 to
.1 foot-lamberts.


Received May 28, 1953. 
Early publication.


References 


1. Atkins, E. W. The efficiency of the eye under different
    intensities of illumination. J. comp. Psychol., 1927, 7, 1-37.
2. Bourdon, B. La perception visuelle de l'espace. Paris:
    Librairie C. Reinwald, Schleicher Frères, Editeurs, 1902. Pp. 432.
3. Brown, J. F. The visual perception of velocity: The thresholds
    for visual movement. Psychol. Forsch., 1931, 14, 190-232, 249-268.
4. Brown, J. F. and Mize, R. H. On the effect of field structure
    on differential sensitivity. Psychol. Forsch., 1931, 15, 355-372.
5. Cobb, P. W. Some experiments on speed of vision. Trans. Illum.
    Engng. Soc., 1924, 19, 150-175.
6. Cobb, P. W. The relation between field brightness and the speed
    of retinal impression. J. exp. Psychol., 1925, 8, 77-108.
7. Craik, K. J. W. Legibility of different coloured instrument
    markings and illuminated signs at low illuminations. Gt. Brit.
    Ministry. 415, 15 January 1942. Pp. 4.
8. Davis, R. C. Modification of the galvanic reflex by daily
    repetition of a stimulus. J. exp. Psychol., 1934, 17, 504-535.
9. Ferguson, H. H. and McKellar, T. P. H. The influence of chromatic
    light stimulation on the subsequent rate of perception under
    conditions of low illumination. Brit. J. Psychol., 1943-44, 34,
    81-88.
10. Ferree, C. E. and Rand, G. Intensity of light and speed of
    vision studied with special reference to industrial situations.
    Part I. Trans. Illum. Engng. Soc., 1922, 17, 69-102.
11. Ferree, C. E. and Rand, G. The effect of intensity of
    illumination on the near point of vision and a comparison of the
    effect for presbyopic and non-presbyopic eyes. Trans. Illum.
    Engng. Soc., 1933, 28, 590-611.
12. Ferree, C. E. and Rand, G. The effect of increase of intensity
    of light on the visual acuity of presbyopic and non-presbyopic
    eyes. Trans. Illum. Engng. Soc., 1934, 29, 293-313.
13. Freeman, C. L. The speed of neuro-muscular activity during
    mental work. J. gen. Psychol., 1931, 5, 479-494.
14. Graham, C. H. and Hunter, W. S. Thresholds of illumination for
    visual discrimination of direction of movement and for the
    discrimination of discreteness. J. gen. Psychol., 1931, 5,
    178-190.
15. Hartline, H. K. Relative merits of lights of different wave
    length in aircraft cockpit illumination. U.S.N.R.C.-C.A.M.
    Report No. 10, June 1941. Pp. 1.
16. von Helmholtz, H. L. F. Treatise on physiological optics.
    Translated by J. P. C. Southall. J. Opt. Soc. Amer., 1925, 3,
    x + 688.
17. Luckiesh, M. and Moss, F. K. Seeing: A partnership of lighting
    and vision. Baltimore: Williams and Wilkins, 1931. Pp. 248.
18. Luckiesh, M. and Moss, F. K. A correlation between illumination
    intensity and nervous muscular tension resulting from visual
    effort. J. exp. Psychol., 1933, 16, 540-555.
19. Luckiesh, M. and Moss, F. K. Reading as a visual task. New York:
    D. Van Nostrand Co., 1942, 315-335.
20. Mueller, C. G. and Lloyd, V. V. Stereoscopic acuity for various
    levels of illumination. Proc. Nat. Acad. Sci., 1948, 34,
    [pages illegible].
21. Myers, C. K. and Harris, J. D. The emergence of a tonal
    sensation. Medical Research [illegible], U. S. Submarine Base,
    New London, 31 March 1938. Pp. 10.
22. McFarland, R. A. Human factors in air transport design. New
    York: McGraw-Hill, 1946, 433-486.
23. McFarland, R. A., Knehr, C. A., and Berens, C. Metabolism and
    pulse rate as related to reading under high and low levels of
    illumination. J. exp. Psychol., 1939, 25, 65-75.
24. Office of Scientific Research and Development. Summary
    technical report of [illegible] N.D.R.C. Vol. III. Transmission
    and reception of sounds under combat conditions. Washington,
    1946, 69-108.
25. Rock, M. L. Annotated bibliography on visual performance at low
    photopic illumination levels. U.S.A.F. Air Materiel Command.
    AF Technical Report 6013, November 1950. Pp. 31.
26. Rounds, G., Schubert, H., and Poffenberger, A. T. Effects of
    practice upon the metabolic cost of mental work. J. gen.
    Psychol., 1932, 7, 65-79.
27. Senders, V. L. The physiological basis of visual acuity.
    Psychol. Bull., 1948, 45, 465-490.
28. Sheard, C. The effects of intensity of illumination on
    presbyopia, accommodation and convergence. Amer. J. Opt., 1936,
    13, 241-254.
29. Spragg, S. D. S. and Rock, M. L. Dial reading performance as
    related to illumination variables. I. Intensity. U.S.A.F., Air
    Materiel Command, Memorandum Report MCREXD-694-21. 1 October
    1948. Pp. 32.
30. Spragg, S. D. S. and Rock, M. L. Dial reading performance as
    related to illumination variables. II. Spectral distribution.
    U.S.A.F., Air Materiel Command, Memorandum Report
    MCREXD-694-21A. 1 December 1948. Pp. 3.
31. Thorndike, E. L. Practice in the case of addition. Amer. J.
    Psychol., 1910, 21, 483-486.
32. Tinker, M. A. Illumination and the hygiene of reading. J. educ.
    Psychol., 1934, 25, 669-680.
33. Tinker, M. A. Illumination intensities for reading. Amer. J.
    Ophthal., 1935, 18, 1036-1038.
34. Tinker, M. A. Illumination standards for effective and
    comfortable vision. J. consult. Psychol., 1939, 3, 11-20.
35. Tinker, M. A. The effect of illumination intensities upon
    fatigue in reading. J. educ. Psychol., 1939, 30, 561-571.
36. Tinker, M. A. The effect of adaptation upon visual efficiency
    in illumination studies. Amer. J. Optom., 1942, 19, 143-151.
37. Tinker, M. A. Criteria for determining the readability of type
    face. J. educ. Psychol., 1944, 35, 385-396.
38. Tinker, M. A. Effect of visual adaptation upon intensity of
    illumination preferred for reading with direct light. J. appl.
    Psychol., 1945, 29, 471-476.
39. Titchener, E. B. Experimental psychology. Vol. I. New York: The
    Macmillan Co., 1901, 309, 313, 321-327.
40. Weston, H. C. and Taylor, A. K. The relation between
    illumination and efficiency in fine work. (Typesetting by
    hand.) London: H. M. Stationery Office, 1933. Pp. 24.
41. White, L. R., Britten, R. H., Ives, J. E., and Thompson, L. B.
    Studies in illumination. II. Relation of illumination to ocular
    efficiency and ocular fatigue among the letter separators in
    the Chicago Post Office. Pub. Hlth. Bull., No. 181, Wash. Gov.
    Printing Office, 1929. Pp. 58.
42. Woodworth, R. S. Experimental psychology. New York: Holt and
    Co., 1938, 647.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 5, 1953 


Applied Psychology in Action 


Evaluating Supervisory Training at the Job Performance Level 


Theodore R. Lindbom 


Personnel Department, Midland Cooperative Wholesale, Minneapolis, Minn. 


Evaluation of college and university courses, 
of practical necessity, ordinarily ends with the 
semester-end examination. The assumption 
is made, rightly or wrongly, that performance 
in the examination is correlated with perform- 
ance in future situations where course content 
can and should be applied. This is the re- 
port of the results of an attempt to evaluate, 
beyond the classroom level, the performance 
of a group of University of Minnesota General 
Extension Division students who had taken a 
course in supervision. The course is a discus- 
sion type course in human relations for super- 
visors, with emphasis on the recognition of 
individual differences and the “human ele- 
ment” in supervision, which runs for a semes- 
ter and consists of 16 evening meetings each 
1½ hrs. in length. Practically all students
were employed full time with about two-thirds
in supervisory capacities. The group studied
were students during the spring and fall 
semesters of 1950 and the spring semester of 
1951 in 5 different sections of the class all 
taught by the writer. This evaluation was 
made in addition to traditional classroom ex- 
aminations and test-retest with the standard 
test, “How Supervise?”.

A mailed questionnaire, sent in March, 
1952 with one follow-up, produced 66 returns 
from 129 students. Of these 66, 41 were from 
persons in supervisory jobs. The analysis of 
these 41 returns is presented here.

The 2 major questions asked were concerned
with behavior changes at two levels:
(1) changed behavior of the supervisor
on-the-job; and (2) changed behavior of the
people he supervised resulting from changed
methods of supervision.

In answer to the question, “Is there anything
that you are now doing differently—as
a foreman or supervisor—because of your
experience in these discussions?” 63% answered
yes, 27% no, and 10% did not answer or
were undecided. Typical of the comments in
answer to this question were:
“More consideration of the employees’
problems.”
“When employees are by-passed for upgrading,
an explanation is given them.”
“I am spending more time determining the
facts when grievances arise.”
“Realizing the personality differences of
people and using that in dealing with people.”

In answer to the question, “Is there anything
about the people you supervise now that
is different from the way they were before—
anything that has resulted from changes of
your operations due to taking ‘Elements of
Supervision’?” 44% answered yes, 29% no,
and 27% did not answer or were undecided.
Typical of the comments made when respondents
were asked to describe these changes in
the people they supervised were:
“Morale is better than in other divisions,
not to speak of increased efficiency.”
“They show more of an attitude of working
‘with’ rather than ‘for.’”
“My employees come to me for help in almost
any type of problem.”
“Lower costs of operation due to willingness
to cooperate with their foreman and
themselves.”
Although results indicate that the course
was successful, at least to some degree, the
study design permits neither definite conclusions
nor generalization. The group to begin
with was a highly selected one, and the per
cent return of questionnaires is low enough
to allow an additional selection factor to be
operating. Time between completion of the
course and filling out the questionnaire ranged
from 10 to 22 months. Conscious or unconscious
misrepresentation of facts by respondents
is also a possible factor.
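The return rate and the approximate head counts behind the reported percentages can be worked out from the figures given above. This is an illustrative sketch only: the counts are inferred by rounding the percentages against the 41 supervisory returns, and the names used (`implied_counts` and the dictionary labels) are hypothetical.

```python
MAILED = 129       # questionnaires sent
RETURNED = 66      # questionnaires returned
SUPERVISORS = 41   # returns from persons in supervisory jobs


def implied_counts(percentages, base):
    """Convert reported percentages to the nearest whole-number counts."""
    return {label: round(pct / 100 * base) for label, pct in percentages.items()}


return_rate = round(100 * RETURNED / MAILED, 1)
own_behavior = implied_counts({"yes": 63, "no": 27, "undecided": 10}, SUPERVISORS)
subordinates = implied_counts({"yes": 44, "no": 29, "undecided": 27}, SUPERVISORS)

print(return_rate)    # per cent of mailed questionnaires returned
print(own_behavior)   # implied counts for the supervisor's own behavior
print(subordinates)   # implied counts for changes in the people supervised
```

Both sets of implied counts sum to 41, which is consistent with the percentages having been computed on the 41 supervisory returns.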
Because of these limitations, the study is
not reported for its specific findings or
generalization. Instead, it is presented as an
illustration of an easy-to-make evaluation at
a level beyond the traditional classroom examination
which appears to measure more directly
the kind of behavior change which the
course was intended to bring about—the on-the-job
behavior of the supervisor and the
people he supervises.


Criterion Rationale for a Personnel Research Program 


Theodore R. Vallance, Albert S. Glickman, and George J. Suci 


American Institute for Research 1 


The authors have been engaged in setting 


up and putting into operation a program for 
personnel research with naval officers. In any 
activity of this kind, it is vital to develop a 
rational framework or “research constitution” 
for the program. Since it appears that the 
framework we have developed contains many 
elements which have general applicability in 
the planning of personnel research in other 
educational, industrial, or military settings, 
the substance of our criterion rationale is pre- 
sented here. 


Ultimate and Intermediate Criteria 


When initiating a program of personnel research
we must first ask: What is the nature
of the criterion? Until the criterion is defined,
assessment of the program, or any of
its parts, is not possible. The answer to the
question is reflected most ultimately in terms
of furthering the objectives of the organization.
For the Navy the long-run goals are:

“1. To defend and support the Constitution
of the United States against all enemies.

2. To maintain, by timely and effective
military action, the security of the United
States, its possessions and areas vital to its
interest.

3. To uphold and advance national policies
of the United States.

4. To safeguard the internal security of
the United States.” 2

The “success” of any organization is measured
by the extent to which its objectives are
achieved. However, the fulfillment of “ultimate”
organizational aims can seldom, if ever,
be directly assessed. Other outcomes, less remote,
more susceptible to measurement, must
normally be used to evaluate day-to-day
operations. Historical hindsight and logical
analysis are the usual standards for assuming
correlation between the ultimate criterion and
subordinate “intermediate” criteria. Contributions
to success are then measured at many
levels presumed to be correlated with the more
ultimate criterion.

1 [illegible] at the Officer Personnel Research
Project, U. S. Naval [illegible], Newport, R. I.,
under contract Nonr 890(00) between the American
Institute for Research and the Office of Naval
Research.

2 Key West Agreement, 1948.

The practical research problem, then, is to 
determine the highest organizational level at 
which quantifiable measures considered to be 
reflections of personnel behaviors are possi- 
ble. In our naval model, ships or command 
units ashore comparable in size, complexity, 
and autonomy, represent the organizational 
entities upon whose efficiency the performance 
of a given crew member may be expected to 
have a measurable effect. Consequently, it 
was taken as a basic assumption that for the 
measurement of success of individuals, the
performance of ships or comparable shore 
units represents the highest level at which 
meaningful and practical quantitative cri- 
terion measurement can be established. As 
such these performances also comprise the 
basis for determining the validity and prac- 
tical meaningfulness of subordinate criteria 
—for departments, divisions, smaller groups, 
and individuals. 


Criteria of Individual Success 


Several questions which arise during the 
process of evaluating naval personnel are believed
to have bearing in programs of evaluation
for other kinds of administrators and
executives. These are discussed below.

What is “successful performance” for the 
individual? At what rank, or after what time 
on the job, can or should evaluation be made? 

The possibility must be recognized that not 
all officers who are competent at one level of 
rank or responsibility will be equally com- 
petent at the next or other higher levels. (It 
has not been shown that a good ensign neces- 
sarily makes a good admiral.) The question 
then arises: Is “success” composed of the 
same factors at each level? If not, we are 
confronted with constantly shifting criteria 
and must choose the intermediate criterion 
level that is desirable for a particular class of 
officers on logical or empirical grounds. 

As the correlation between criteria for the 
several ranks decreases, the risk increases that 
the selection variables validated against lower- 
level rank criteria will be unrelated to higher- 
level rank criteria, or indeed may be nega- 
tively related to them. 

We are thus confronted with the question: 
When are we to define “success” as having 
been achieved? Is it at Officer Candidate 
School, or when an ensign, or when a com- 
mander, or when a chief-of-staff, or is per- 
formance in the next superior rank an ade- 
quate criterion against which to evaluate per- 
formance at immediately subordinate ranks? 

We are also confronted with the related 
problem of deciding what criterion levels to 
choose for assessing the effectiveness of train- 
ing. That is, by what performance standards, 
at what level of responsibility, should the 
adequacy of training at Officer Candidate 
School and elsewhere be judged? Should per- 
formance right after schooling be taken as 
the measure of training effectiveness, or should 
evaluations be made after some specified time 
has elapsed? 

Likewise, with regard to selection, assignment,
promotion, retirement, and command,
there arises the question of where to look for
appropriate standards.

Competency in an executive hierarchy is
usually considered to be highly correlated with
rank. Ideally, rank and competence in relevant
areas should be perfectly correlated.
To the extent that the correlation is less than
1.00, room exists for improvement of techniques
for evaluating training and duty performance,
and for assignment to jobs.

It must be recognized that much of the 
preceding involves policy decisions at a high 
level and consists of questions which cannot 
be answered by a research unit. Lack of such 
policy decisions leaves the research goals in 
doubt, leads to confused direction, and lowers 
the utility of research products. 


Criterion Methodology 


The relative status of a variable as a predictor
or a criterion is in many cases simply
dependent upon the chronological sequence in
which variables may be organized. Each intermediate
performance criterion presumably
should be correlated with a more ultimate
criterion and hence serve as a predictor of it.
As demanded by exigencies, many performance
measures can be considered either criteria
or predictors.

Comparability of measures of performance 
is a basic requirement if such measures are 
be used effectively. If success is determine“ 
at all ranks and in all duties by the same fac- 
tors, with differences from rank to rank rep” 
resenting only variations in degree rather es 
kind of factors involved, then the approach is 
evaluation is relatively simple. Although m 
1S not likely to be true in most cases, it ee 
volves Primarily attempts to increase 1° n 
bility of measures of the factors demonstrate 
to possess the highest validity as criteria. | 

Questions of the comparability of crite? 
measures then crop up with respect tO “ 
aspects of criterion measurement and we @ r 
faced with the problem of how to gie 
criterion measures equivalent in considerati? 
of differences of rank, raters, ship types» du 3 
hazard, kind of subordinates, and other sit¥ 
tional factors. he 

Applicable to all of the foregoing is the
question: What is the line of demarcation between
satisfactory and unsatisfactory performance?
Does the standard of satisfactory
and unsatisfactory fluctuate as a function of
any, or several, of the above?






Illustrative Criterion Measures


Individual job performance criteria may be 
classified in many ways, dictated by the in- 
stitution’s goals and organization. For naval 
officers we have organized them under two 
broad headings: technical skill and human 
relations skill. 

Each of these sets of skills in turn may be
considered as effectors of success at several
levels in the operational hierarchy, which in
the Navy would be the ship, the group (departmental,
divisional, or other), and the individual.

These may further be sub-classified as to 
whether they are demonstrated under train- 
ing conditions or on-the-job. 

Finally these criteria may be further speci- 
fied according to the type of measure being 
applied, as schematized, for example, in Chap- 
ter 5 of R. L. Thorndike’s Personnel Selec- 
tion, Wiley, 1949. 


How’s Your Empathy? 


“Empathy” is a word that has been in the 
dictionaries a long time but it’s just beginning 
to gain recognition as an important quality 
for executives at all levels. One dictionary 
definition is “the imaginative projection of 
one’s own consciousness into another being” 
but as used by psychological consultants in 
business and industry empathy is used to in- 
dicate the ability to imaginatively project the 
other fellow’s consciousness into your own, 
thereby putting yourself mentally into his 
shoes, to the point of being able to guess
pretty closely what his thoughts and reactions
will be in a given situation.

In its simplest form empathy is well illus- 
trated by the old story of the village idiot 
who found the lost horse when nobody else 
could. He just sat down and figured out
where he would go if he were a horse. He
went there and there was the horse.

At a recent meeting of chemical ne 
Dr. Richard S. Schultz, a New York psychological
consultant, mentioned the importance
of empathy as an executive quality, pointing
out that individuals who rate high in this characteristic
can more readily understand, predict
and control the thinking, feeling, and
actions of other people. Psychologists are
now at work devising methods of measuring
this quality.


“The simplest illustration of empathy is to 
recall your last experience at an exciting 
athletic event or theater show,” he said. 
“Remember how you reacted and identified 
yourself with specific thoughts, feelings, and 
actions of the feature personalities?” 

Empathy, he said, may be further described 
as a combination of social sensitivity and 
social intelligence. “It is with such awareness 
that we can be most skillful in our daily con- 
tacts with people,” he said. 

It is encouraging to know that progress is 
being made in measuring the characteristic 
technically known as empathy, for it is a 
trait that under various inexact tags has been 
recognized as an important though elusive 
attribute of success. It is often the reason 
why two men of apparently equal upbringing, 
education, intelligence, and opportunity will 
vary so widely in their degree of business suc- 
cess: one of them can see things from the 
other fellow’s point of view; the other can’t, 
and continually rubs people the wrong way. 

If accurate measurements are on the way 
that will measure this sort of “social savvy” 
they will be of particular use to the insurance 
business, in which cooperating with people 
and getting along with them on a good basis 
are so much more important than purely technical
know-how. (The National Underwriter, July 3, 1953.)


Book Reviews 


Maier, Norman R. F. Principles of human 
relations, applications to management. 
New York: John Wiley & Sons, Inc., 1952. 
Pp. ix + 474. $6.00. 


This book appears to fulfill three purposes, 
(1) to present Dr. Maier’s research and ex- 
perience with human relations training pro- 
grams in industry, (2) to furnish a basic text- 
book for courses in human relations in indus- 
try with material adaptable to laboratory 
exercises, and (3) to serve as a manual to 
guide industrial psychologists in introducing 
human relations programs in business and in- 
dustry. Although the systematic discussion is 
based primarily upon Dr. Maier’s own work, 
the conclusions that he reaches are similar to 
those which have been obtained previously by 
others in the area of human relations. 

The material deals mainly with methods
and techniques for a human relations training
program, including the use of group discussion
methods, role playing, and group decision procedures.
The use of such techniques is aimed
at overcoming hostility, fears, feelings of insecurity,
frustrations, and other barriers to
acceptance of democratic supervisory practices.
In addition, how to assist the supervisor
to be permissive in his dealing with
individuals and to use non-directive counseling
techniques is discussed at some length.
Ample case material is furnished to provide
demonstrations of the value of the various
methods and techniques discussed. Of particular
value is an exposition of how group
discussion and role playing techniques can be
adapted for use with large groups when it is
not possible to use small groups in training.

Concepts employed are well defined and
explained. The general ease of reading is
marred only by an occasional awkward sentence,
and failure to provide adequate transition
from one idea to another.

The major emphasis of the book is aimed
at explaining and furnishing demonstrations
of the value of group discussion and role playing
for supervisory training at all levels from
top management to line supervisors. Through
the use of such techniques it is possible to
change a supervisor’s feelings and attitudes
which conflict with maintaining good human
relations with his group of workers. The use
of role playing in training situations directs a
supervisor’s attention away from the words
or logic which are overtly expressed and
focuses it upon the feelings which govern the
course of interpersonal relations. Most important
of course he comes to understand his
own involvement in the process. Only through
gaining insight in regard to how feelings influence
the tenor of interpersonal relations and
how they determine the kind of mutual understanding
which results can a supervisor, if
necessary, come to appreciate and accept new
modes of behavior appropriate for dealing
effectively with other people. Role playing
provides a means of gaining experience under
conditions which do not require a supervisor
to “save face.” Consequently, a situation is
provided where he can examine objectively
how supervisory attitudes, both good and bad,
influence the course and outcome of interpersonal
relations. He comes to realize that
being permissive is more advantageous for obtaining
constructive actions with a concurrent
improvement in his status with the workers in
regard to their respect for his authority, control
and prestige. Such an outcome removes
the hampering effect of presumed risks which
a supervisor imagines might endanger his
capacity to carry out his responsibilities if he
adopts democratic practices. Once he is free
of concern for the necessity of protecting his
own security, a supervisor is able to let his
group solve its own problems under his guidance.
When increased effectiveness of the
group in accomplishing work ensues, the supervisor
is enabled to realize the value of
exercising democratic control by utilizing the
forces which are in the group rather than
depending upon the use of his power.

Dr. Maier’s book is a good illustration of
the basic contribution that psychology can
make toward developing a realistic philosophy
of making life and work tolerable in modern
industrial society. True, the basic principles
of such a Philosophy still go back to Ae i 
Plato, the Christian ethics, and the English
common law. But psychology, by utilizing
the scientific method, can still contribute materially
to their realization by furnishing
proof of the effectiveness of various methods
and techniques which when fully assimilated
into the mores of our society will come eventually
to be considered common sense approaches
to maintaining good human relations.
It is fortunate that psychology has
men with the capability and insight of Dr.
Maier, whose approach to developing an applied
science of human behavior transcends
the bonds of narrow specialization.

Wilton P. Chase
Air Research and Development Command,
Human Resources Research Center,
Lowry Air Force Base, Denver, Colorado


Deese, James. The psychology of learning.
New York: McGraw-Hill, 1952. Pp. x
+ 398. $5.00.

Since this book is designed as a text (for
advanced undergraduate and graduate students),
one of the difficulties inherent in all
textbook reviews is encountered here: The
professional reader will find in it much that
is already familiar, and low in interest value,
but which the student may react to very differently.
The reviewer must therefore try,
as best he can, to look at the book through
student eyes. When this is done, the present
volume stands up well. It is clearly and
simply written, with an occasional colorful
turn of phrase. It is, moreover, comprehensive
in scope, including, as it does, discussion
of animal and human learning in both laboratory
and everyday (clinical, applied) settings,
and yet wisely avoids trying to be encyclopedic
in coverage: “broad rather than exhaustive”
is the author’s avowed aim. While the
book is not founded upon or integrated around
any one conception of learning, and thus not
so provocative as it might otherwise be, it has
the merit of being accurate, critical, and of
very possibly inspiring students to get right
to work on research designed to fill in the
more glaring gaps in our knowledge.

Perhaps the short-coming of the
book is the author’s failure to interrelate discussions
in different chapters. Not infrequently
a given piece of research
will be discussed in
one chapter; and yet, in a later
chapter, this will not be brought to
bear upon problems where it is rather obviously
relevant. A similar criticism is also
occasionally appropriate with respect to experimental
facts and hypotheses available in
the literature which have not been included
in the book at all. The net result is that the
book lacks more in total impact and cogency
than it needs to, and in later editions may well
rise to the challenge of its topic more fully than it
presently does.


The author is well aware of the applied
potentialities of the psychology of learning
and refers from time to time to such fields as
education and psychotherapy. However, he
is conservative in what he believes laboratory
fact and theory can at present contribute
along these lines. He very usefully points to
some of the not very happy results of premature
application of particular conceptions of
learning and urges further inquiry rather than
rash “practicality.”

While The Psychology of Learning puts a 
desirable emphasis upon laboratory proce- 
dures as a source of knowledge in this area, it 
tends to slight what is already known and can 
be further learned in the “applied” setting. 
In other words, it does not emphasize as much 
as it might the reciprocal benefits of interac- 
tion between laboratory and field. It some- 
times seems to imply that all knowledge origi- 
nates in the laboratory and is then channeled 
toward application in the field. The author 
properly notes that laboratory theories some- 
times fall on their face in a practical setting, 
but he does not, in the reviewer’s judgment, 
give the field proper credit as itself a kind of 
“laboratory” and certainly a setting which 
can provide highly stimulating questions and 
suggestions to be carried back for more rigor- 
ous types of investigation. 

The Psychology of Learning is an excellent
job of bookmaking; and despite its being
pitched at the textbook level, professional
readers will find parts of it novel and exciting.

O. Hobart Mowrer
University of Illinois


Ulrich, David N., Booz, Donald R., and
Lawrence, Paul R. Management behavior
and foreman attitude. Boston: Harvard
Business School, 1950. Pp. 56. $.75.


This report is the result of an 8 month case
study made by a research team consisting of
the three authors. The study was carried out
through informal observation and interviews
in a manufacturing firm employing about 500
persons located in a large eastern city. About
half the time of the study was spent in observation
of an assembly department of 36
female employees and their foreman.

As the title implies, the main object of the 
study was to determine what effects the be- 
havior of top management had on the fore- 
man. In addition, the effects of the behavior 
of other groups on the foreman, including his 
employees, staff specialists, and his immedi- 
ate superior were also studied. 

A number of difficult and strained relation- 
ships at all levels of the organization are 
pointed out, causes of these difficulties hy- 
pothesized, and recommendations made on 
how the relationships could be improved. A 
major recommendation made is that top man- 
agement make greater efforts to understand 
the effects of administrative action on em- 
ployees and supervisors. 

Because the only evidence given to back up 
what is said consists of scattered selected 
quotations of remarks made by those ob- 
served, the reader will find he is being asked 
to accept the conclusions and recommenda- 
tions pretty much on faith in the analytical 
ability of the researchers. Despite this limitation,
however, few readers who deal with
similar problems will finish this report without
some new insights into these problems and
new ideas for dealing with them in their own
situations.

Theodore R. Lindbom
Personnel Department,
Midland Cooperative Wholesale,
Minneapolis, Minn.


Weinland, James D. and Goss, Margaret V.
Personnel interviewing. New York: Ronald
Press, 1952. Pp. vii + 416. $6.00.

This book deals with the aims and techniques
of business interviewing and is addressed
to individuals concerned with personnel
relations and employment. Although
many of the principles and procedures are applicable
to all types of personnel interviewing,
the book emphasizes employment. Chapters
are devoted, however, to other types of interviewing,
such as merit rating, disciplinary,
counseling, etc.

A section on the interviewer and his work
deals with introductory and background material,
ranging from individual differences to
interviewing environment and the training of
interviewers. A second part deals with techniques,
including material on directive, non-directive,
and patterned interviews. The third
part of the book deals with interviews for
various purposes.

Although the book contains much of value
and has interesting material and views, the
over-all effect is disappointing. Perhaps one
reason for disappointment is the great need
for a comprehensive and up-to-date text in
the field of personnel interviewing. The chapters
of this book get off to a good start, but
the reviewer had a feeling of disappointment
at the end of each. This was due partly to
failure of the authors to organize and systematize
the material adequately. This is
true in spite of their predilection for lists
and classifications. Unfortunately, such lists
often appeared incomplete or haphazard.
The authors are guilty of looseness, ambiguity
and over-generalization. One feels
that some dogmatically worded statements
would be considered better as hypotheses than
as proved facts.

The reviewer’s over-all opinion is indicated
by the fact that although he is currently training
interviewers, he is not using this book.
Other materials are being used even though
older, or available only in less accessible
form.

Clifford E. Jurgensen
Minneapolis Gas Company


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, 
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota. 


The workshop handbook. Walter A. Ander- 
son, Rollin P. Baldwin, and Mary Beau- 
champ. New York: Columbia University, 
1953. Pp. 65. $1.00. 

Adjustment to physical handicap and illness:
A survey of social psychology of physique
and disability. Roger G. Barker, Beatrice
A. Wright, Lee Meyerson, and Mollie R.
Gonick. New York: Social Science Research
Council, 1953. Pp. 440. $2.00.

Differential migration in the corn and cotton 
belts. Donald J. Bogue and Margaret Jar- 
man Hagood. Oxford: Scripps Foundation, 
1953. Pp. 248. $2.25.

The fourth mental measurements yearbook. 
Oscar K. Buros, Editor. Highland Park, 
N. J.: The Gryphon Press, 1953. Pp. 
1,189. $18.00.

Group dynamics. Dorwin Cartwright and 
Alvin Zander. Evanston: Row, Peterson 
and Company, 1953. Pp. 642. 

Human behavior: psychology as a bio-social 
Science. Lawrence E. Cole. New York: 
World Book Company, 1953. Pp. 884. 
$4.56.

A factor analysis of verbal and non-verbal
tests of intelligence. Reverend James T.
Curtin. Washington, D. C.: The Catholic
University of America Press, 1952. Pp.
63. $1.25.

Raising the sights of office management. M.
J. Dooher, Editor. New York: American
Management Association, 1953. Pp. 59.
$1.25.

Industry enters the atomic age. M. J.
Dooher, Editor. New York: American
Management Association, 1953. Pp. 31.
$1.25.


Guides to meeting tomorrow’s production
needs. M. J. Dooher, Editor. New York:
American Management Association, 1953.
Pp. 64. $1.25.

Planning for worker security and stability.
M. J. Dooher, Editor. New York: American
Management Association, 1953. Pp.
40. $1.25.


The new climate of union-management rela- 
tions. M. J. Dooher, Editor. New York: 
American Management Association, 1953. 
Pp. 32. $1.25. 

Factors in intelligence and achievement.
Justin A. Driscoll. Washington, D. C.:
The Catholic University of America Press,
1952. Pp. 56. $1.00.

College board scores. Henry S. Dyer. New 
Jersey: College Entrance Examination 
Board, 1953. $.75. 


Stabilization of employment is good manage- 
ment. Charles C. Gibbons. Kalamazoo: 
W. E. Upjohn Institute for Community Re- 
search, 1953. Pp. 16. Gratis. 

The uneducated. Eli Ginzberg and Douglas 
W. Bray. New York: Columbia Univer- 
sity Press, 1953. Pp. 246. $4.50. 

Psychosis and civilization. Herbert Gold- 
hamer and Andrew Marshall. Glencoe: 
The Free Press, 1953. Pp. 126. $4.00. 

Measurements of human behavior. Edward 
B. Greene. Revised edition. New York: 
The Odyssey Press, Inc., 1953. Pp. 790. 
$4.75. 

A clinical approach to children’s Rorschachs. 
Florence Halpern. New York: Grune and 
Stratton, Inc., 1953. Pp. 288. $6.00. 

Introduction to psychology. Ernest R. Hil- 
gard. New York: Harcourt, Brace and 
Company, 1953. Pp. 659. $7.50. 

Current problems in psychiatric diagnosis. 
Paul H. Hoch and Joseph Zubin. New 
York: Grune & Stratton, 1953. Pp. 291.
$5.50. 


The psychology of successful selling. Richard 
W. Husband. New York: Harper and 
Brothers, 1953. Pp. 306. $3.95. 

Techniques of successful foremanship. Eugene 
E. Jennings. Madison: University of Wis- 
consin, School of Commerce, Bureau of 
Business Research and Service, 1953. Pp. 
41. $1.15. 

Psychology and alchemy. C. G. Jung. New
York: Bollingen Foundation, Inc., 1953.
Pp. 563. $5.00.


The psychology and psychotherapy of Otto 
Rank. Fay B. Karpf. New York: Philo- 
sophical Library, 1953. Pp. 129. $3.00. 

Elementary school objectives. Nolan C. 
Kearney. New York: Russell Sage Foun- 
dation, 1953. Pp. 189. $3.00. 

Rehabilitation of the physically handicapped. 
Henry H. Kessler. New York: Columbia 
University Press, 1953. Pp. 275. $4.00. 

Statistical methods in experimentation. Nolan
C. Lacey. New York: Macmillan Co.,
1953. Pp. 249.

The retarded reader in the junior high school. 
May Lazar, Editor. New York: Board of 
Education, Bureau of Educational Re- 
search, 1952. Pp. 126. 

The psychology of personal and social ad- 
justment. Henry Clay Lindgren. New 
York: American Book Company, 1953. 
Pp. 481. $4.50. 

Design and analysis of experiments in psychology
and education. E. F. Lindquist.
Boston: Houghton Mifflin Co., 1953. Pp.
393. $6.50.

In the minds of men. Gardner Murphy.
New York: Basic Books, Inc., 1953. $4.50.

Rorschach interpretation: advanced tech- 
nique. Leslie Phillips and Joseph G. Smith. 
New York: Grune and Stratton, Inc., 1953. 
Pp. 400. $8.75. 

Wait the withering rain. Austin L. Porterfield.
Fort Worth: Leo Potishman Foundation,
1953. Pp. 147. $2.50.

Social psychology. S. Stansfeld Sargent.
New York: The Ronald Press Company,
1953. Pp. 519. $4.50.

Occupational information. Carroll L. Shartle.
New York: Prentice-Hall, Inc., 1952.
$5.00.


An experiment in recreation with the mentally
retarded. Bertha E. Schlotter and
Margaret Svendsen. Chicago: Illinois Department
of Public Welfare, 1951. Pp.
142. Gratis.

Medical public relations. Edgar A. Schuler, 
Robert J. Mowitz, and Albert J. Mayer. 
New York: Health Information Founda- 
tion, 1952. Pp. 228. 

Groups in harmony and tension. Muzafer
Sherif and Carolyn W. Sherif. New York:
Harper & Brothers, 1953. Pp. 316. $3.50.

Introduction to experimental method. John 
C. Townsend. New York: McGraw-Hill 
Book Co., Inc., 1953. $4.00. 

Modern educational problems. Arthur E.
Traxler, Editor. Washington, D. C.:
American Council on Education, 1953. Pp.
147. $1.50.

Improving transition from school to college.
Arthur E. Traxler and Agatha Townsend.
New York: Harper & Brothers, 1953. Pp.
165. $2.75.


Personality tests and assessments. Philip E.
Vernon. London: Methuen & Co. Ltd.,
1953. Pp. 220.

The roots of psychotherapy. Carl A. Whitaker
and Thomas P. Malone. New York:
The Blakiston Company, Inc., 1953. Pp.
236. $4.50.

The measured effectiveness of employee publications.
Association of National Advertisers,
1953. Pp. 109.

Drug addiction among adolescents. Committee
on Public Health Relations of the
New York Academy of Medicine. New
York: The Blakiston Company, Inc., 1953.
Pp. 320. $4.00.


Journal of Applied Psychology 


Vol. 37, No. 6


December, 1953


The Prediction of Proficiency of Taxicab Drivers 


Clarence W. Brown and Edwin E. Ghiselli 


University of California, Berkeley 


In the evaluation of devices for use in the
selection of operators of public conveyances,
greatest attention has been given to the criterion
of accidents. In some instances labor
turnover has been considered, but safety of
performance has been given greater emphasis.
While the importance of accidents and labor
turnover certainly is not to be minimized,
it should be apparent that the success of
operators of vehicles can be measured in
other important ways. In the taxicab industry,
for example, job success can be gauged
in terms of the dollar volume of business that
the driver achieves. The economic health of
a taxicab company can be improved by reducing
costs due to accidents and personnel replacements,
but it is more directly related to
the monetary return accomplished from the
sale of its services. It is apparent, therefore,
that the selection of individuals who can sell
their services as taxicab drivers is worthy of
consideration.

Very little information is available on the
effectiveness of predicting the productivity of
taxicab drivers. Wechsler reports inconsistent
correlations between sales and intelligence test
scores (4). Viteles found intelligence tests
to be of no value but obtained substantial predictions
from a weighted personal data blank
(3). The results of investigations conducted
to date provide little help in planning an experimental
program for driver selection.


Criteria


The amount of business conducted by a
taxicab company is subject to a number of
uncontrolled variables. It is affected by such
obvious factors as weather and season. But
in addition, sales are sensitive to other types
of occurrences such as large civic entertainments,
conventions, the payment of bonuses
by some large local organization, and the like.
In many instances sales rise or fall for no
discernible reason. Since these variations
may be as great as 100% it is apparent that
corrections must be applied to a driver’s sales
in order to compensate for the time trends
in the volume of business. In the present investigation
the average sales for all drivers
were computed for each week, and the productivity
of each driver was expressed as a
percentage of this average. This procedure
controlled most but not all of the time trends.
These weekly indices formed the basis of the
production criteria employed in the validation
studies reported here.
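
The weekly index described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (each driver's sales as a percentage of that week's average for all drivers); the function name, the record layout, and the dollar figures are hypothetical and do not come from the study.

```python
from collections import defaultdict


def weekly_indices(sales):
    """Express each driver's weekly sales as a percentage of that week's mean.

    `sales` is a list of (driver, week, dollars) records; this layout is an
    assumption made for the sketch, not the authors' record format.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _driver, week, dollars in sales:
        totals[week] += dollars
        counts[week] += 1
    return {
        (driver, week): 100.0 * dollars / (totals[week] / counts[week])
        for driver, week, dollars in sales
    }


# Hypothetical figures: business in week 2 is half that of week 1,
# yet each driver's index is unchanged.
records = [("A", 1, 120.0), ("B", 1, 80.0), ("A", 2, 60.0), ("B", 2, 40.0)]
indices = weekly_indices(records)
```

Because every figure is divided by the same week's average, a week-wide rise or fall in business cancels out, which is the sense in which the procedure compensates for time trends.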

One characteristic of the taxicab industry 
which is pertinent to all selection studies is 
the high rate of turnover among the drivers. 
A person who has been with a company for a 
year is considered to be an “old hand.” This 
high labor turnover means that production 
records for any extensive periods of employ- 
ment are not obtainable for large numbers of 
drivers working under relatively homogeneous 
conditions. In the present investigation pro- 
duction during the first eighteen weeks of em- 
ployment was used. In spite of the fact that 
this period of time is relatively short, the re- 
liability of the measures of proficiency was 
quite satisfactory. The coefficient of correla- 
tion between production indices on odd and 
even weeks, corrected by the Spearman-Brown 
formula, was found to be .96. 
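
As a rough sketch of the odd-even procedure just described, the Python below correlates each driver's mean index on odd weeks with his mean index on even weeks and then applies the Spearman-Brown correction for doubled length, r' = 2r / (1 + r). The helper names and the sample data are hypothetical, and the use of half-period means is an assumption; the article does not say how the odd and even weeks were pooled.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_brown(r_half):
    """Step a half-length correlation up to the full-length reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def odd_even_reliability(indices_by_driver):
    """`indices_by_driver` holds one list of weekly indices per driver,
    week 1 first (a hypothetical layout for this sketch)."""
    odd = [sum(row[0::2]) / len(row[0::2]) for row in indices_by_driver]
    even = [sum(row[1::2]) / len(row[1::2]) for row in indices_by_driver]
    return spearman_brown(pearson(odd, even))


# Hypothetical indices for three drivers over four weeks.
sample = [[100, 100, 110, 110], [80, 80, 90, 90], [120, 120, 100, 100]]
reliability = odd_even_reliability(sample)
```

Under this formula a corrected value of .96 corresponds to an uncorrected odd-even correlation of about .92.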

A cross validation study of the tests was 
conducted in a second and smaller company. 
Due to the particular accounting methods of
this company, sales records were not available
in a usable form. The manager of the
company, however, provided ratings of his 
drivers’ productivity on a six-point scale. 
In making the ratings the manager discussed 
each driver with the investigators, thus care- 
fully reviewing the driver’s achievement be- 
fore placing him in one of the rating cate- 
gories. While no evidence of reliability was 
obtained, in terms of distribution statistics, at 
least, the ratings were satisfactory. The rat- 
ings were made on the men after three months 
of employment. 


Subjects 


The subjects in the present investigation 
were men who applied for work as taxicab 
drivers and who were hired. Only those cases 
were used who had no previous experience in 
driving taxicabs. They did vary, however, 
in the amount of experience they had had in 
driving other types of commercial vehicles. 

Various selective factors operated so that 
the subjects used were by no means repre- 
sentative of the entire range of talent of ap- 
plicants. Prior to being hired the men were 
interviewed and took a driver road test. 
About 20% were rejected on these bases. In 
addition, about another 20% were rejected 
on the basis of very poor scores on the apti- 
tude tests to be described here. As a final 
selective factor, only those cases were used 
who remained on the job either 18 or 12 weeks 
or more. In the two companies used in the 
present investigation, approximately 40% of 
the drivers left their jobs within the 18 or 12 
week periods. The men utilized in this study,
then, represent about 20% of the applicants:
those who survived the hiring procedures and
remained on the job at least 18 weeks for the
larger company and 12 weeks for the smaller
company. For the basic validation study, 54
men were drawn from the first company, and
for the cross validation study 29 men were
drawn from the second company.


Predictor Variables 

Seven aptitude tests were utilized together
with an interest inventory. All of the tests
were time limited, and all were of the paper
and pencil variety. As indicated earlier, the
tests and the inventory were administered
prior to hiring. The choice of the particular
measures utilized was dictated by an interest
in predicting several aspects of success rather
than concentrating on sales alone. Certain of
the measures were found to predict accidents
and labor turnover (1, 2).

An arithmetic test was employed which involved
problems in making change and computing
fares. A test, termed Speed of Reaction,
presented the individual with a series
of rules that he was to use in making differential
responses to various spatial arrangements
and organizations of letters. Some indication
of motor speed and precision was
obtained from dotting and tapping tests.
The dotting test called for the placing of a
single dot in each of a series of irregularly
spaced circles. In the tapping test only
speed was required, the individual tapping as
rapidly as possible with his pencil, placing
three dots in each of a series of circles.

Two tests of spatial ability were administered
which primarily involved the ability to
detect differences in distances. In the Judgment
of Distance test each item was a schematized
table top on which rested four cubes of
equal size. On the basis of perspective and
interposition the individual judged which
cubes were nearest together. The Distance
Discrimination test called for the discrimination
of linear distances between points. A
Mechanical Principles test was used which
consisted of a series of pictorially presented
problems each of which required knowledge
of some simple principle of mechanics.

In the interest inventory each item involved
a pair of occupations or jobs, and the individual
chose the one of each pair which he preferred.
The choices were between a higher
and a lower occupation, a job performed inside
as compared with one performed outside,
a job involving dealing with people and
one not requiring such activity, and a job
involving moving about rather than one requiring
sedentary activity.


Results


Table 1 gives the validity coefficients for
the various predictor variables used against the
sales production criterion for the basic group
of 54 drivers. With the possible exception of
the arithmetic test, none of the predictors
alone would be considered to give adequate
prediction. When the extent of restriction in
range of talent is considered, however, low
coefficients assume some importance. With
the exception of the Judgment of Distance
test, all measures would seem to merit further
study.


Table 1

Validity Coefficients of Several Tests for Predicting
Sales Production of 54 Taxicab Drivers

Test                        Validity Coefficient
Arithmetic                         .29
Speed of Reaction                 -.19
Dotting                            .21
Tapping                            .15
Judgment of Distance              -.03
Distance Discrimination            .24
Mechanical Principles              .15
Interest Inventory                 .20

A simple combination of test scores was
effected by eliminating the Judgment of Distance
test, assigning unit weight to each of
the others, and assigning a negative value to
the Speed of Reaction test. In effect, this
composite score was the sum of the standard
scores of the individual tests. The validity
of this battery score for the 54 basic cases
was .39, which is a reasonably satisfactory
prediction.
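
A minimal Python sketch of such a unit-weighted composite follows. It standardizes each test across drivers, reverses the sign for Speed of Reaction, omits Judgment of Distance, and sums the results. The dictionary keys, the function name, and the use of the population standard deviation are assumptions made for illustration; the article does not give these computational details.

```python
from statistics import mean, pstdev

# Unit weights as described in the text: +1 for each retained test,
# -1 for Speed of Reaction; Judgment of Distance is left out.
# The key names are hypothetical labels for this sketch.
UNIT_WEIGHTS = {
    "arithmetic": 1,
    "speed_of_reaction": -1,
    "dotting": 1,
    "tapping": 1,
    "distance_discrimination": 1,
    "mechanical_principles": 1,
    "interest_inventory": 1,
}


def composite_scores(raw, weights=UNIT_WEIGHTS):
    """Sum of signed standard scores for each driver.

    `raw` maps a test name to a list of raw scores, one per driver, in the
    same driver order for every test. Each test is assumed to show some
    variation, since a constant test has no standard score.
    """
    n_drivers = len(next(iter(raw.values())))
    totals = [0.0] * n_drivers
    for test, weight in weights.items():
        scores = raw[test]
        m, s = mean(scores), pstdev(scores)
        for i, x in enumerate(scores):
            totals[i] += weight * (x - m) / s
    return totals


# Hypothetical two-test example with three drivers.
demo = composite_scores(
    {"arithmetic": [1, 2, 3], "speed_of_reaction": [3, 2, 1]},
    weights={"arithmetic": 1, "speed_of_reaction": -1},
)
```

Standardizing first keeps a test with a wide raw-score range from dominating the sum, so that "unit weight" means equal weight per standard deviation rather than per raw point.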

As is well known, there is almost always a
shrinkage in validity coefficients in cross validation
studies. The test weights mentioned
above were used in validating the scores of the
cases in the second company. In this cross
validation the validity of the battery was
found to be .29. While this value may not
appear to be particularly significant it is to be
remembered that in addition to restriction of
range of talent this coefficient is affected by
the use of a somewhat different criterion.

In view of the complex nature of the pro- 
duction criterion, it is surprising that such 
tests as dotting, tapping, and discrimination 
of distances have any predictive power at all. 
No logic would lead an investigator to em- 
ploy such tests in predicting sales of taxicab 
service. To be sure, the extent of prediction 
by individual tests was low, but the combination
of tests gave a usable index of
aptitude.


Summary


Seven tests and an interest inventory were 
administered to 54 taxicab drivers and vali- 
dated against their sales. With one possible 
exception, no single test gave adequate pre- 
diction. A simple weighted combination of 
the tests yielded a validity of .39. When the 
weighted battery was applied to another group 
of 29 drivers it was found to have a validity 
of .29 in the prediction of ratings of job 
proficiency. 


Received March 16, 1953. 


References 


1. Brown, C. W. and Ghiselli, E. E. Prediction of
labor turnover by aptitude tests. In press.

2. Ghiselli, E. E. and Brown, C. W. Prediction of
accidents of taxicab drivers. J. appl. Psychol.,
1949, 33, 540-546.

3. Viteles, M. S. Industrial psychology. New York:
Norton, 1932.

4. Wechsler, D. Tests for taxicab drivers. J. Person.
Res., 1926, 5, 24-30.


The Journal of Applied Psychology
Vol. 37, No. 6, 1953 


Some Measured Characteristics of Air Force Weather Forecasters
and Success in Forecasting ¹


James J. Jenkins 


University of Minnesota 


A review of the psychological and meteoro- 
logical literature reveals that little is known 
about the measured psychological character- 
istics of weather forecasters, and this writer 
has found no studies relating such character- 
istics to occupational success. In the past it 
appears that high scholastic ability or achieve- 
ment has been accepted as essential. Selec- 
tion practice in the AAF Technical Training 
Command during World War II stressed high 
scores on tests of academic ability, mathe- 
matics, and physics (e.g. 10, 11, 12) since the 
Weather Forecasting course was believed to 
be one of the most difficult courses offered in 
the technical training schools. Success in the 
course showed low positive correlations with 
the AGCT and mathematics tests (e.g. 8, 9). 
Harrell (2) in a survey of AGCT scores of 
209 AAF technical specialties found the en- 
listed weather forecasters to be the highest 
ranking group with a median score of 136.7. 
This perhaps indicates only that the screen- 
ing on intelligence was very effective.

The purpose of the present study was: (1) 
to determine how Air Force forecasters are 
differentiated from a more general population; 
and (2) to disclose the extent to which cer- 
tain measures are associated with ability to 
forecast weather. 


Procedure 


In 1948 the writer secured the cooperation
of the Air Weather Service for a study of
some of the psychological characteristics of
forecasters and the possible relation of these
characteristics to success in forecasting. A
study of available job descriptions and lists
of qualifications (e.g. 13, 14) and a job analysis
from these sources and the writer’s own
experience as a forecaster resulted in the selection
of the following variables for consideration
as related to success: education, college
major, mathematics background, forecasting
and observing experience, kind of meteorological
training, forecasting aids most frequently
used, speed and accuracy of perception,
spatial relations ability, general academic
ability, and vocational interests. Information
on all but the last four of these was gathered
by means of a questionnaire. The remaining
variables were measured by the Minnesota
Clerical Test, the Revised Minnesota Paper
Form Board, the Ohio State University Psychological
Test, and the Strong Vocational Interest
Blank for Men. The tests were administered
to the forecasters by the Air Weather
Service and the results returned to the writer.


¹ This study was made possible by the cooperation
of the Air Weather Service, U.S. Air Force, and was
undertaken with the encouragement of General D. N.
Yates, then Chief of the Air Weather Service. The
writer is especially indebted to Prof. Donald G.
Paterson for his assistance and guidance in every
phase of the study. This paper is part of the writer’s
Ph.D. thesis on file in the library of the University
of Minnesota under the title of “Prediction of forecasting
efficiency for Army weather forecasters.”


The Criterion 


The problem of obtaining criterion data had
been encountered by the Air Weather Service
early in World War II. Muller (5) in a review
of the literature on verification of forecasts
points out that no less than 54 methods
of evaluation were proposed between [year illegible]
and 1943 and that all of these have been
vigorously criticised. After a long program
of experimentation by the Weather Information
Branch, a special verification method was
devised by Lt. M. J. Slonim (15) which
seemed to avoid most of the usual difficulties.
This procedure consisted of evaluating the
probabilities of occurrence of given values for
each forecast element (pressure, temperature,
precipitation, visibility, and ceiling) from
climatological data for the time and region
being forecast. A scale of 30 equal probability
units (or trentiles) was set up on
which observed and forecast values could be
compared. Discrepancies were summed in
probability units to indicate the relative accuracy
of forecasts. For example, to score
pressure forecasts for a given station at a
given time, we would proceed as follows: (1)
gather past observations for this period of the
year for this locality and make a frequency
distribution of observed pressures; (2) divide
this frequency distribution into 30 intervals
of equal or nearly equal probability of occurrence
(see Table 1 for a partial example);
and (3) score forecasts now obtained in terms
of the number of equal probability intervals
(or trentiles) in the discrepancy between the
forecast and observed values. (In our example,
if a pressure of 995 is observed and a
pressure of 1000.8 is forecast, the score is
zero. If the forecast is 1003.0 the score is
one. If the forecast is 1033.0 the score is
thirty, etc.)
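
The scoring rule in step (3) can be sketched as a lookup of trentile numbers followed by a difference. The sketch assumes the score is the absolute difference between the trentile numbers of the forecast and observed values; the function names are hypothetical, and only the first two interval limits of Table 1 are used because the middle of the table is omitted in the source.

```python
from bisect import bisect_left


def trentile(value, upper_bounds):
    """Return the 1-based number of the interval containing `value`.

    upper_bounds[i] is the inclusive upper limit of interval i + 1.
    A value above the last listed limit falls in the next interval.
    """
    return bisect_left(upper_bounds, value) + 1


def discrepancy(observed, forecast, upper_bounds):
    """Score a forecast as the number of intervals separating it from the observation."""
    return abs(trentile(forecast, upper_bounds) - trentile(observed, upper_bounds))


# Inclusive upper limits of trentiles 1 and 2 from Table 1 (remaining limits omitted).
upper_bounds = [1001.2, 1003.9]

same_interval = discrepancy(995.0, 1000.8, upper_bounds)  # both values in trentile 1
next_interval = discrepancy(995.0, 1003.0, upper_bounds)  # trentile 2 versus trentile 1
```

Summing such discrepancies over all forecast elements and forecasts would give the kind of total the text describes, with smaller totals indicating greater accuracy.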

Each Air Force forecaster in the United
States was required to make at least three
forecasts per week for five widely scattered
stations selected by the Weather Information
Branch. All forecasts were made from the
1230 Greenwich time maps. The data available
to the forecasters were approximately
the same regardless of their location in the
country.


Table 1


A Hypothetical Frequency Distribution of Barometric
Pressures and the Resulting Trentile Table
for Station “X” for a Given
Thirty-Day Period


Pressure Distribution                Trentile Table

Pressure in    Number of
Millibars      Observations          Values of Element       Trentile
 998.7         1
 999.4         1
1000.3         1
1000.8         1                     Less than and
1001.2         1                     including 1001.2          1
1001.7         1
1003.1         1
1003.7         1                     From 1001.3 to
1003.9         3                     1003.9 inclusive          2
 etc.                                etc.
1032.9         7
1033.5         1                     From 1031.8 to
1033.9         2                     any greater value        30


Note: The middle portion of this table is omitted
because of excessive length.

This program ran from 1943 to 1945 and 
furnished the criterion data for this “post- 
diction” study. The criterion yielded a re- 
liability of .90 (estimated from a part-whole 
correlation of .70 between 8 weeks of the pro- 
gram and the total 84-week program). It is 
unfortunate that these data are now available 
only in terms of standard scores so that the 
relative accuracy of the forecasts in terms of 
initial values is unknown. The homogeneity 
or heterogeneity of the forecasters as a group 
is impossible to assay even in terms of prob- 
ability deviations. It is also regrettable that 
the conditions under which the forecasts were 
made (time pressure, amount of other work, 
freedom from interference, etc.) could not 
possibly be equated. The validity evidence is 
largely “face validity” (15). 
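
The text does not say how the reliability of .90 was estimated from the part-whole correlation. One possible reading, offered here purely as an assumption, treats the 84-week program as 84/8 = 10.5 parallel 8-week parts, recovers the correlation between parts from the part-whole correlation, and then applies the Spearman-Brown formula. Under that reading the arithmetic gives a figure close to the one reported.

```python
def reliability_from_part_whole(r_part_whole, n_parts):
    """Estimate total-score reliability from a part-whole correlation.

    Assumes the total is the sum of `n_parts` parallel parts with a common
    inter-part correlation rho, for which r_part_whole**2 = (1 + (k - 1) * rho) / k.
    """
    k = n_parts
    rho = (k * r_part_whole ** 2 - 1) / (k - 1)  # implied inter-part correlation
    return k * rho / (1 + (k - 1) * rho)         # Spearman-Brown step-up to k parts


estimate = reliability_from_part_whole(0.70, 84 / 8)
```

The result is approximately .89, which is consistent with the reported .90 but does not establish that this was the method actually used.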


The Sample 


The sample for this study was sharply
restricted by three conditions. First, the forecaster
must have participated in the criterion
study. Second, he must have remained in the
Air Weather Service until 1948. Third, in
1948 he must have been stationed in or near
the United States so that he could be tested.
Only 92 forecasters met all these conditions
and constituted the sample for this study.
The forecasting scores of the sample were
compared to those of the total group participating
in the verification program (N
= 2023) and were found to resemble them
closely (χ² = 2.832; P = .88).


Characteristics of the Sample 


General. All but two of the forecasters
graduated from high school, but only 30 had
graduated from college. Three had Ph.D. degrees
and 12 others had done post-graduate
work. Average education was 14.3 years with
a standard deviation of 2.2 years. A total of
57 indicated college majors and of these 49
were in the natural sciences or mathematics.
Before starting meteorology training the average
number of college mathematics courses
was three. The range was from no courses
to two Ph.D.’s in mathematics. All of the
sample had received training in meteorology
in military schools under contract with the
Air Forces. Of the sample, 71 per cent had
previous experience as weather observers.

Test Data. Means and standard deviations 
in raw scores for the ability tests are given in 
Table 2. Reference to relevant norm groups 
reveals the forecasters to be a highly selected 
group on all of these variables. 


Table 2


Means and Standard Deviations of Forecasters
on Ability Tests


                                              Standard
Tests                              Mean       Deviation
Minnesota Clerical Test
    Numbers                        141.1      29.8
    Names                          145.8      31.9
Revised Minnesota Paper
  Form Board                        50.1       7.5
Ohio State University Psychological
  Exam.
    Part I                          24.5       3.4
    Part II                         44.6      10.2
    Part III                        47.2       6.8
    Total                          116.3      18.1


The mean scores on the Minnesota Clerical 
Test fall at the 95th and 93rd percentiles for 
Numbers and Names sections respectively 
when compared to gainfully occupied adults 
and at the 60th and 73rd percentiles when 
compared to employed clerical workers them- 
selves (1). 

The mean score on the revised Minnesota 
Paper Form Board when compared to the 
norms of various male industrial groups (3) 
falls from the 80th to the 97th percentile with 
a median value at the 90th percentile. Even 
compared to first and fifth year engineering 
students the percentile ranks are 80 and 70 
respectively. 

For the Ohio State University Psychological
Test the forecasters were compared with college
freshmen as a group which roughly approximated
the pre-army educational status
of the sample. On this basis the mean for

Table 3


Means, Standard Deviations, and Percentage of A and
B+ Ratings for Each Strong Key for Total
Sample of 92 Weather Forecasters

[The percentage columns are illegible in the source.]


                                             Standard
Group   Occupation               Mean        Deviation
I       Artist                   19.8        10.0
        Psychologist             19.4        12.4
        Architect                28.3        10.5
        Physician                28.7         9.7
        Osteopath                32.2         9.7
        Dentist                  27.9         9.9
II      Mathematician            25.5         9.9
        Physicist                22.7        13.8
        Engineer                 40.8        10.1
        Chemist                  38.2        11.5
III     Production Manager       42.6         8.2
IV      Farmer                   40.3         9.2
        Aviator                  43.2        10.2
        Carpenter                [illegible] 10.9
        Printer                  39.7         8.9
        Math. Phys. Sci. T.      43.3         9.7
        Policeman                36.9         8.7
        Forest Service Man       35.1         9.5
V       YMCA Phys. Director      30.5         9.6
        Personnel Director       37.0        10.7
        Public Admin.            43.3         8.7
        YMCA Secretary           22.7        10.1
        Soc. Sci. H. S. T.       31.1        10.5
        City School Supt.        23.4         9.1
        Minister                 18.5        11.1
VI      Musician                 28.5        10.9
VII     C. P. A.                 26.0         8.8
VIII    Accountant               36.0        10.6
        Office Man               37.8         9.6
        Purchasing Agent         34.2         8.9
        Banker                   27.5         8.6
        Mortician                27.0         8.8
IX      Sales Manager            28.1         9.4
        Real Est. Sales.         31.2         6.6
        Life Insur. Sales.       22.1        10.5
X       Advertising Man          26.8         7.9
        Lawyer                   25.8         8.1
        Author-Journalist        25.6         7.5
XI      Pres.-Mfg. Concern       29.8         7.5
        Interest Maturity        55.2         5.5
        Occupational Level       52.6         6.4
        Masc.-Fem.               54.6         8.4



the forecasters falls at the 87th percentile in
Part I (Same-opposites), 81st in Part II
(Word relationships), 90th in Part III (Reading
comprehension), and 87th for the total
score (7).

It is readily apparent that the forecasters
are a superior group on all three of these
tests. While this superiority might be expected
on the Ohio in view of the initial
screening, their superiority on the other two
tests is not readily explained.

The means and standard deviations for
each of the Strong Vocational Interest Blank
keys are given in Table 3 in terms of the occupational
standard scores. The high mean
scores of the meteorologists (B+ and B)
show their interests to be similar to those
of persons in the occupations of Engineer,
Chemist, Production Manager, Farmer, Aviator,
Printer, Mathematics-Physical Science
Teacher, Policeman, Forest Service Man, Personnel
Director, Public Administrator, Accountant,
Office Man, and Purchasing Agent.
Their interests are most markedly dissimilar
(scores in the C area) to those persons in the
occupations of Actor, Banker, Mortician, Real
Estate Salesman, Life Insurance Salesman,
Advertising Man, Lawyer and Author-Journalist.
Most of the rest of the scores were in
or near the chance range.

If one views these occupations in terms of
Strong’s factor analysis data (6), the similarity
of this grouping of the occupations to
Factor III (“Language” or “things versus
People”) is immediately apparent. All of the
occupations in which the meteorologists score
high have positive loadings on this factor (in
the direction of “non-language” and “things”)
and all the occupations in which they score
low have negative loadings (in the direction
of “language” and “people”).

The picture of the forecasters seen in the
interest test results is one of a technical,
skilled-trades interest group with little verbal-linguistic,
pure science, or social service interest.
The relatively low OL score received
by the group seems to reflect the technical,
skilled-trades kind of picture already given.


The Prediction of Forecasting Ability 


In order that the findings might be sub- 
jected to cross-validation the sample was split 
in half. Individuals were paired on criterion 
scores and for each pair a random determina- 
tion was made as to which member fell in the 
first group and which into the second group. 
A double cross-validation technique (4) was 
used, prediction devices being prepared on 
each group and validated on the other group. 
Regression, cutting scores, and profile tech- 
niques were utilized. The final results of 
this procedure are summarized here rather 
than presenting the work in detail.
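
The matched split described above can be sketched as follows. The criterion scores are randomly generated placeholders (not the study's data), and the function name `paired_split` and the fixed seeds are assumptions made so the example is repeatable.

```python
import random


def paired_split(criterion_scores, seed=0):
    """Pair cases adjacent in criterion rank, then assign one of each pair at random.

    Returns two lists of case indices with closely matched criterion distributions.
    """
    rng = random.Random(seed)
    order = sorted(range(len(criterion_scores)), key=lambda i: criterion_scores[i])
    group_a, group_b = [], []
    for j in range(0, len(order) - 1, 2):
        pair = [order[j], order[j + 1]]
        rng.shuffle(pair)  # random determination of which member goes where
        group_a.append(pair[0])
        group_b.append(pair[1])
    if len(order) % 2:  # an unpaired case, if any, goes to a randomly chosen group
        rng.choice([group_a, group_b]).append(order[-1])
    return group_a, group_b


# Hypothetical criterion scores for 92 forecasters.
data_rng = random.Random(1)
scores = [data_rng.gauss(50, 10) for _ in range(92)]
group_a, group_b = paired_split(scores)
```

In a double cross-validation, any prediction device fitted on `group_a` would then be checked on `group_b`, and the reverse.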

No consistent differences were found be- 
tween the better and poorer forecasters with 
respect to age, rank, education, college major, 
mathematics background, forecasting and ob- 
serving experience, kind of training, forecast- 
ing aids most frequently used, interest test 
profiles or scores on the Revised Minnesota 
Paper Form Board. The Ohio State Univer- 
sity Psychological Test proved to be of little 
use in discriminating degrees of success in 
forecasting but of the 27 persons who scored 
low (below 44) on Part III (reading com- 
prehension), 19 of them fell in the lower half 
of the forecasting group. High scorers were 
not, however, distinguished from other fore- 
casters. 

The Numbers section of the Minnesota 
Clerical Test proved to be of no value in the 
prediction of forecasting accuracy but the 
Names section proved to be of considerable 
value. In both halves of the sample it cor- 
related + .31 with the criterion. When used 
alone with a cutting score it consistently 
eliminated at least twice as many cases from 
the lower half of the criterion group as from 
the upper half. When used in profile rela- 
tionship with the Numbers section of the test 
and the Revised Minnesota Paper Form 
Board, it eliminated 35 per cent of the cases 
in the lower half and only 4 per cent of the 
cases in the upper half. (This amounts to a 
crude use of the other two tests as suppressor 
variables. They correlate positively with the 
Names section and essentially zero with the 
criterion.) 
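
The way a cutting score is judged in the paragraph above, by comparing the proportion of cases it eliminates from the lower and upper halves of the criterion group, can be sketched as below. The data are hypothetical, a higher criterion value is assumed to mean better forecasting, and `elimination_rates` is an illustrative name.

```python
from statistics import median


def elimination_rates(test_scores, criterion, cutoff):
    """Return the fractions of lower-half and upper-half cases scoring below `cutoff`.

    Halves are defined by the median of the criterion.
    """
    mid = median(criterion)
    lower = [t for t, c in zip(test_scores, criterion) if c < mid]
    upper = [t for t, c in zip(test_scores, criterion) if c >= mid]

    def rate(group):
        return sum(t < cutoff for t in group) / len(group)

    return rate(lower), rate(upper)


# Hypothetical test scores and criterion values for eight cases.
test_scores = [100, 110, 120, 130, 150, 125, 160, 170]
criterion = [1, 2, 3, 4, 5, 6, 7, 8]

lower_rate, upper_rate = elimination_rates(test_scores, criterion, cutoff=128)
```

A useful cutting score is one for which the first fraction is much larger than the second, as with the 35 per cent and 4 per cent reported for the profile rule.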


Discussion


In view of the findings of this research it
would seem that the role of speed and accuracy
in perception as measured by some
component of the Names section of the Minnesota
Clerical Test should be investigated
carefully in future studies of weather forecaster
training and success on the job. The
high level of clerical ability found in the
forecasters as a group seems to argue that
some kind of selection on this variable is
already taking place, and the correlation with
forecast verification seems to indicate that
this is an important though not an a priori
obvious source of variation in on-the-job
performance.

It should be noted, however, that the search
for predictors cannot be considered at all
complete. Both of the tests which proved to
have any predictive efficiency in this study
functioned only at the lower score levels to
provide negative selection. A study of other
abilities and the motivational and personality
characteristics of those individuals who were
high on all of the tests employed here but
still relatively low in forecasting accuracy is
obviously necessary.


Summary


A sample of 92 Air Force Weather Forecasters
was studied to determine: (1) how the
sample was differentiated from a more general
population; and (2) the extent to which
biographical data and psychological measures
were associated with ability to forecast.
Subjects completed a questionnaire and four
standard psychological tests. The sample
proved to be similar to the World War II
population of forecasters with respect to forecasting
ability as measured by the Short-Range
Forecast Verification Program.

The sample proved to be a highly select
group with respect to educational background,
clerical ability, spatial relations ability, and
general academic ability. With respect to interests
the forecasters appear to resemble a
technical, skilled-trades interest group with
little verbal-linguistic, pure science, or social
service interests.

A double cross-validation study to predict
forecasting accuracy revealed only one consistent
predictor, the Names section of the
Minnesota Clerical Test, which correlated
+ .31 with skill in forecasting.

It is suggested that further studies of the
role of perceptual skills and personality variables
in weather forecasting are needed.


Received February 6, 1953.


References


1. Andrew, Dorothy M. and Paterson, D. G. Minnesota
Clerical Test Manual. New York: Psychological
Corporation, 1946.

2. Harrell, T. W. Army General Classification Test
results for Air Forces specialists. Educ. psychol.
Measmt., 1946, 6, 341-349.

3. Likert, R. and Quasha, W. H. The Revised Minnesota
Paper Form Board Test Manual. New
York: Psychological Corporation, 1948.

4. Mosier, C. I. Problems and designs of cross-validation.
Educ. psychol. Measmt., 1951, 11,
5-11.

5. Muller, R. H. Verification of short-range weather
forecasts - a survey of the literature. I, II,
and III. Bull. Amer. Meteorological Soc.,
1944, 25, 18-27; 47-53; 88-95.

6. Strong, E. K., Jr. Vocational interests of men
and women. Stanford: Stanford University
Press, 1943.

7. Toops, H. Manual of Directions; The Ohio
State University Psychological Test. Chicago:
Science Research Associates, 1941.

8. Air Forces Technical Schools Validation Study.
AAF Technical Training Command. 9-19-42.

9. Air Forces Technical Schools Validation Study.
AAF Technical Training Command. 9-30-42.

10. [Entry largely illegible.] Technical Training
Command, 1942.

11. Classification Division Bulletin. No. 13. Knollwood
Field, N. C., AAF, Hq., Technical Training
Command. Sept., 1942.

12. Classification Division Bulletin. No. [illegible]. Knollwood
Field, N. C., AAF, Hq., Technical Training
Command. Dec., [year illegible].

13. Meteorology as a Profession, Vocational Booklet
No. 4, National Roster of Scientific and Specialized
Personnel. Washington: United States
Government Printing Office, 1947.

14. Physical [title partly illegible] as a profession.
[Series title illegible], National Roster of
Scientific and Specialized Personnel. Washington:
United States Government Printing
Office, 1947.

15. Short-Range Forecast Verification Program.
Technical Report 105-26. Publications of
Weather Information Branch, Hdqrs. AAF,
1943.


The Journal of Applied Psychology
Vol. 37, No. 6, 1953 


A Note on Small Samples 


Edward N. Hay 
Edward N. Hay & Associates, Inc., Philadelphia, Pa. 


The sample presents more delicate problems
than almost any other aspect of measurement.
To begin with, many psychologists
working on problems of testing have been in
the habit of going through many refined
operations to correct for the deficiencies of a
“small sample.” Small sample methods were
developed in agronomy, where it is possible
to hold most of the variables reasonably constant.
This is much less true in testing human
beings. Consequently, the mere application
of small sample statistics cannot be
expected to produce automatically more valid
results than would be the case without them.
Sometimes the characteristics of a sample
are such that no amount of treatment will
bring about a satisfactory result.

The ceaseless search for “large samples”
has resulted in many errors. A psychologist
not long ago, in the course of an industry
study, published norms which were the result
of adding together a great many small
samples. It was not possible to make any
check on the soundness of this operation because
of lack of information. However, large
samples have frequently been made out of
groups of small samples, when the characteristics
of individual small samples differed
widely. In one instance, the mean of one
sample was more than one sigma away from
the mean of another sample. The reasons
for the difference could only be conjectured
but certainly one larger sample was not to be
had by adding two small samples which differed
this much.

Some years ago, I violated my own prin- 
ciples by assembling a group of 120 subjects 
from more than 20 departments of a single 
company. The resulting “hash” actually pro- 
duced reasonably satisfactory validity co- 
efficients just by luck. On another occasion, 
there were three departments of 10, 7, and 24 
employees respectively. Before adding them 
together to make a “large” sample, an ex- 
amination of the characteristics of all three 
groups was made. This revealed that there 
was little variability in the test scores for the 
two smaller groups. In the circumstances, 
neither one could produce a validity coefficient
or contribute to one when added to other
samples. The larger group gave r’s exceed- 
ing .5 on a number of tests. 
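
The kind of examination described above, made before small samples are added together, can be sketched as a comparison of group sizes, means, and standard deviations. The scores are hypothetical and the function names are assumptions; expressing the gap between means in units of the pooled within-sample standard deviation is one reasonable reading of "more than one sigma away."

```python
from statistics import mean, stdev


def describe(groups):
    """Return (n, mean, SD) for each named group of scores."""
    return {name: (len(x), mean(x), stdev(x)) for name, x in groups.items()}


def mean_gap_in_sigma(a, b):
    """Difference between two sample means in units of the pooled within-sample SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return abs(mean(a) - mean(b)) / pooled_var ** 0.5


# Hypothetical test scores from two departments.
dept_a = [10, 12, 14, 16, 18]
dept_b = [20, 22, 24, 26, 28]

summary = describe({"A": dept_a, "B": dept_b})
gap = mean_gap_in_sigma(dept_a, dept_b)
```

A gap well above one sigma, or a group with very little spread, would argue against simply pooling the cases.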

Bransford has developed procedures for 
handling the criterion measures so as to be 
able to combine samples which are unlike in 
some degree; for example, ratings made by 
different raters. He spoke on this topic be- 
fore the Eastern Psychological Association at 
Atlantic City in 1952 under the descriptive 
title “Summational Within-Group Analysis.” 
Combining criterion measures presents more 
difficulties usually than combining the scores 
of tests of the same groups. 


Received September 17, 1953. 
Early publication.


The Journal of Applied Psychology
Vol. 37, No. 6, 1953 


The Measurement of Personality and Behavior Disorders by the 
I. P. A. T. Music Preference Test 


Raymond B. Cattell and Jean C. Anderson 


Laboratory of Personality Assessment and Group Behavior, University of Illinois 


On the wide research front which is roughly 
designated by “projective tests,” but perhaps 
more accurately by “misperception tests” 
(1), few recent advances have been so promis- 
ing as that connected with music perception. 
The powerful and immediate connection of 
musical stimulation with emotional experi- 
ence, and the many indications that uncon- 
scious needs gain satisfaction through this 
medium, have long pointed to measures of 
musical preference as effective avenues to 
deeper aspects of personality. Moreover, the 
lack of verbal content is itself, on general 
principles, a promise that the verbal, cogni- 
tive defenses of the censor may be by-passed 
and the emotional needs probed more directly 
without distortion by defense elaborations. 


The Music Preference Test 


Personality tests which proceed from the
esthetic reactions of the subject, or from likings
and dislikings which cannot be based on
logical, explicit relationships to the subject’s
purposes and sentiments, occupy an area intermediate
between that of projective tests
and that of other objective personality tests.
For the liking or disliking is evidently due to
characteristics imported or projected into the
physical sounds by the listener, yet the “projections”
are not so explicit as in the imagery
evoked by the Rorschach or the interpretive
stories which the subject is asked to weave
around the T.A.T. It is possible, therefore,
that further research and clinical experience
with this relatively unexplored class of tests
(which may be called tests of “affective misperception”)
will show them to have certain
advantages over the standard projective or
misperception tests. For sophisticated subjects
intuitively realize that their cognitive
projections stand in need of defensive disguise,
whereas their likings and dislikings
make no more sense to them than they do to
the psychologist, before his statistical analyses
are made.

As in all test construction involving “items,”
it would be foolish here to design psychological
measures hinging on the luck of a
single response and to attempt to relate such
a single response to personality dimensions.
Instead we first seek reliability in the test
measurement itself by composing it of scores
on several items, thereby diminishing the
effects of chance and specific historical associations.
This may be conceived as discovering
the dozen or more “items” that can
be validly added together to give a score on
some single dimension of emotional quality
or musical-emotional reactivity. Attempts to
find these groupings by introspection or
psychiatric judgments must be set aside, for
they are shown by preliminary research to be
highly unreliable and to constitute an amateurish
approach to the problem. Instead it
is necessary to find the dimensions of musical
choice by submitting a number of musical
excerpts to a large population and correlating
the responses, thereby discovering empirically
which responses “go together.” This
first stage of research in the area has already
been carried out by Cattell and Saunders,
using 120 half-minute musical excerpts under
conditions described elsewhere.

The psychologically interesting and reassuring
thing about this factor analysis of
a matrix couched in a new variety of response
correlations, namely, in music preference
responses, is that simple structure was as
definitely obtained here as with ability tests,
and that a comparison of two factorizations
revealed a very gratifying degree of invariance
of the factors. With this assurance from
an initial study it is to be hoped that psychologists
will be encouraged to face the
amount of exacting work required by this approach
instead of being beguiled by merely
esthetic intuitions in test construction.
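
The first empirical step described above, correlating responses across a population to see which excerpts "go together," can be sketched as an item intercorrelation matrix. This shows only the correlation step, not the factor analysis itself; the like/dislike data are hypothetical, responses are coded 1 for "like" and 0 for "dislike" as an assumption, and every item is assumed to have some variation.

```python
def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_matrix(responses):
    """Correlate every pair of items; `responses` holds one row per subject."""
    items = list(zip(*responses))  # transpose to one tuple of responses per item
    return [[pearson(a, b) for b in items] for a in items]


# Hypothetical responses of six subjects to three excerpts (1 = like, 0 = dislike).
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
matrix = correlation_matrix(responses)
```

Items that correlate highly with one another, like the first two excerpts here, are candidates for being scored together on one dimension, which a factor analysis would then confirm or reject.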




The two dovetailed factor analyses yielded
eleven stable factors (4). But before these
basic findings could become a practical
foundation upon which further “applied” research
could readily go forward, in clinics
and guidance centers generally, it was first
necessary to construct out of the above research
findings a convenient routine instrument.
This was done under the auspices of
the Institute for Personality and Ability Testing
by the senior author and has issued in a
12-inch long-playing record, reproducing 100
half-minute music excerpts (50 on one side,
Form A, and an equivalent 50 on the other,
Form B). Except for the first and the last
three factors in this test there are ten items
provided to measure each factor. These items
were chosen from the 120 factorized, according
to the usual test construction principles:
a significant loading on the factor concerned;
a balancing (suppression) of loadings on factors
not concerned; a balancing of “like” and
“dislike” responses in the score for any one
factor; no use of any item for more than one
factor. A cyclical order of sampling of items
from the various factors is used in the test as
finally presented.

The test so constructed, when cross validated
on a new population, was found to have
consistency (split-half reliability) and equivalence
(Form A vs. Form B) reliability coefficients
(2) that were adequate on only
seven or eight of the eleven factors. See Table

This inadequacy arises largely from some
factors being measured on a bare minimum
of 3 or 4 items in one form. Accordingly it
is advocated that only seven or eight independent
factors be routinely measured in
standard clinical use and that the remaining
three or four measures serve an exploratory
purpose, as “located nuclei” from which further
research can, by extension into new items,
build up factor scales.
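
The link between scale length and reliability noted above can be illustrated with a split-half coefficient and the Spearman-Brown formula. Whether the authors applied the Spearman-Brown correction is not stated, so its use here is an assumption; the item scores are hypothetical and the function names are illustrative.

```python
def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r, k):
    """Predicted reliability when a scale with reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(first_half, second_half), 2)


# Hypothetical scores of five subjects on a four-item factor scale.
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
reliability = split_half_reliability(item_scores)
doubled = spearman_brown(0.5, 2)  # a scale with reliability .5, doubled in length
```

The same formula shows why extending a three- or four-item scale into new items would be expected to raise its reliability.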

The test has been initially standardized
for every factor on a normal population
of 380 student and non-student adults
ranging from 18 to 68 years of age. The instructions,
which are given in standard form
by the voice on the record, are set out below.
The I.P.A.T. Music Preference Test of Personality
(3) is thus normally presented simply
as a test of musical preferences, but the implication
that we were psychologically interested
in the results from the standpoint of
personality measurement was realized, at least
by the normal group, in this particular experiment.


First Issues Needing Research 


Now that such a measuring instrument is 
available, a number of researches immediately 
suggest themselves, especially in applied psy- 
chology. Concerning its promise as a per- 
sonality test it is at once apparent from in- 
spection of the actual musical excerpts found 
to be highly loaded in the various factors, 
that these factors are not merely cultur- 
ally-determined groupings, corresponding to 
musical “schools” or periods (with one possi- 
ble exception among the eleven factors: F 1). 
With this superficial interpretation rejected 
we may next examine the hypothesis that 
these factors correspond to what have been 
called major “hidden premises” in the logic 
of personal preference (1). For these hidden 
premises of choice decision, according to our 
hypothesis as stated elsewhere (1), should be 
temperamental and early-environment-determined
dimensions of personality itself.

If this is correct, there should be some sub- 
stantial correlations between these factors and 
the factors on the 16 Personality Factor Ques- 
tionnaire or any other measure of the primary 
personality factors. This at least is the 
hypothesis upon which the whole of the pres- 
ent investigation has been carried forward. 
If the musical choices are determined by per- 
sonality factors, i.e., by emotional needs and 
constitutional tempers, we should expect, 
further, that various neurotic and psychotic 
syndromes, which are themselves explicable 
in terms of combinations of personality fac- 
tors, and sometimes in terms of single per- 
sonality factors, should show correlations 
with the musical choices. The immediately 
needed investigations, therefore, seem to be: 
(1) a study correlating the music factors 
with primary personality factors, in a normal 
group; and (2) a comparison of psychotics 
and normals in terms of musical preference 
factor profiles. 

The hypotheses that the music factors correspond
to needs or to temperamental factors
can be tested by this design, but one should 
also recognize that a third possibility exists 
—namely that the discovered music factors 
represent affective mood states, temporary 
dynamic stimulus conditions, physiological in- 
fluences, etc. This alternative, however, need 
not be investigated unless the present search 
for stable personality associates proves abor- 
tive. Some “function fluctuation” associated 
with mood will almost certainly exist and it 
will attenuate our correlations. But if our 
hypothesis is correct that the major associa- 
tions will be found in relation to relatively 
stable personality structures, then it could 
seem better to track down this residual, “fluc- 
tuation” variance later. At that point not 
only the associations of the music factors with 
mood, but also the individual tendencies to 
high or low fluctuation on the music factors 
will bring in relationships of further impor- 
tance for understanding musical preference 
and personality. 

A fourth design of research which is also 
immediately needed is a factorization of a 
population of psychotics, to see whether the 
structure of factors is the same there as in a
normal group. Unless there is some fairly
close resemblance of the factor structure in
the two groups, it would indeed be illogical
to measure psychotics on the same dimensions
as those found among normals. Accordingly
we have also gathered data for factorization
of the same 120 excerpts on a population of
100 psychotics, and this will be intercorre- 
lated and factorized if statistical man-hour 
resources can be provided by the Music Re- 
search Foundation. 

The general reaction of cultivated listeners
to the above propositions has been that our
hypotheses neglect the role of intellectual and
cognitive functions in musical appreciation.
Our argument is that these functions are not
primary but are only means to ends: technical
rationalizations of the aesthete, perhaps
changing superficially with the culture,
for satisfactions which are deeper and more
stable. Initial experimental support for this
position is given by the fact that the music
factors do not apparently correspond in content to


Raymond B. Cattell and Jean C. Anderson 


cultural or technical dimensions. A research 
designed to tackle this question more posi- 
tively has meanwhile been set in motion. It 
consists of an experiment in which fifty 
choices in pictorial art, thirty choices in
architecture, and forty choices in sculpture 
are intercorrelated and also correlated with 
the factors in musical choice. If the same 
factorial dimensions appear here, aligning 
themselves with the music factors, and cutting 
across periods and cultural integrations, there 
will be additional evidence that we are pro- 
ceeding beyond technical, cultural or his- 
torical patterns. 


Design of the Experiment 


The first part of our investigation, that
with normal subjects, called for the administration
of the Musical Preference Test to a normal
population which should be: (1) well varied in
personality; and (2) simultaneously measured on a
sufficiently reliable and valid measure of the
primary personality factors. The main contribution
to the test population consisted of 102 male and
female subjects, 76 of whom were University of
Illinois students, ages 18 to 29, and 26 of whom
were “general adults,” ages 30 to 81. The remainder
were tested in a second sub-group consisting of 55
students, both male and female, ages 17 to 28.
Since we needed to apply a personality test which
deals with primary and independent personality
dimensions of known associations we employed the
I.P.A.T. 16 Personality Factor Questionnaire, which
is also convenient for group administration with
reasonably literate populations. The 16 P.F.
includes intelligence as one dimension. Each of the
157 subjects, therefore, took a one-hour music
preference test in which both forms A and B of the
music test were administered, and a half-hour
session in which Form A of the 16 P.F. Test was
administered. The instructions in the Music
Preference Test are on the beginning of the record,
and are as follows:


“This is a test of your likings and dislikings in
music. Your score has nothing to do with how
much you agree or disagree with popular tastes,
but only with how much you agree with yourself,




Measurement of Personality and Behavior Disorders 449 


that is, with how consistent² you are. So try to
say, as each piece is played, whether you yourself
like it; whether it is pleasant, so that you
would like to hear more of it, or whether you
would just as soon have it switched off.

“On the score sheet before you are numbers
for the fifty pieces that will be played, each for
less than half a minute. As each comes to an
end, underline L, I, or D, opposite that number,
indicating you like it, or have an intermediate,
indifferent reaction, or dislike it. Dislike does
not mean that you hate it, but only that you
don’t particularly like that kind of music. In
fact you should aim to have just as many D’s as
L’s underlined when you get to the end. Try
not to use I for intermediate more than you need.
In fact, you should expect to end up with very
roughly one-third L’s, one-third I’s and one-third
D’s. But don’t bother about that too much.
Just give your reactions as truthfully as
possible. . . .”


The administration of the Music Preference
Test to a group of psychotics took place at
Kankakee State Hospital, Kankakee, Illinois.
In this case the subjects were taken in small
groups of three or four at a time, in order that
it might be ascertained that they were appropriately
responding on the answer sheets to every piece of
music. It is well known that diagnoses in different
mental hospitals do not agree very highly (as shown
on the individual cases transferred from hospital to
hospital), and that the very proportions of
manic-depressives, schizophrenics, hysterics, and
other psychotic syndromes, as diagnosed in different
institutions, may vary considerably. As usual
a good deal of difficulty was experienced in
obtaining a sufficient sample of some psychiatric
syndrome groups. In accepting the group divisions
finally used the criterion for classification was
naturally the hospital diagnosis as reached in case
conferences. A total group of 98 psychotic patients
was obtained consisting of 36 alcoholics, 22
schizophrenics of mixed types, 10 manics, 7
paranoids, and 23 in other categories each not
sufficient in number for separate use in our study.
The subjects were both male and female, the age
range being approximately 25 to 60 years.
tin Ts, obviously ake the person tobe th 

and to give his consid e Er ye 


istent with regard to 
ts were not music 


Results for the Normal Personalities 


The findings for the normal group will first 
be described. Our initial interest turns on the 
reliabilities, a minority of which, as men- 
tioned above, were low enough to suggest 
dropping certain factors. These correlations 
are presented first as consistency (split-half) 
coefficients in Table 1, Part A and secondly 
as coefficients of equivalence (correlation of 
Form A with Form B) in Part B of Table 1. 

The equivalence coefficients perhaps do not 
do justice to the tests because the highest 
loaded items were in every case put in the 
A form, since, when psychometrists are un- 
able to use the full length test, it is the A 
form that they will use. This reduces the 
equivalences (columns 5 and 6) below the 
consistency coefficients (columns 2 and 4) 
which more truly represent the internal con- 
sistency, and are defective—for a 10-item 
length of scale—only on factors 3, 9 and 10, 
recommended to be dropped. 
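
The step from the half-length coefficients to the full-length consistency coefficients in Table 1 can be checked with a short sketch. It assumes the standard Spearman-Brown prophecy formula with a lengthening factor of 2; the function name and the choice of factors (those whose half-length entries are legible in this copy) are illustrative only.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Predicted reliability of a scale lengthened k times (k=2 for split halves)."""
    return k * r / (1.0 + (k - 1.0) * r)


# Half-length coefficients as printed in Table 1, Part A, for the factors
# whose entries are legible in this copy.
half_length = {1: 0.71, 2: 0.62, 3: 0.06, 6: 0.27, 8: 0.46}

# Step each half-length coefficient up to the full 10-item scale.
full_length = {factor: round(spearman_brown(r), 2) for factor, r in half_length.items()}

for factor, r_full in full_length.items():
    print(f"factor {factor}: half {half_length[factor]:.2f} -> full {r_full:.2f}")
```

Under these assumptions the computed values for factors 1, 3, 6 and 8 agree with the legible full-length entries of Table 1 (.83, .11, .43 and .63).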

The correlations between the sixteen factors 
of the 16 P. F. Test and the eleven factors of
the Music Preference Test were worked out 
separately for the two populations, as a mu- 
tual check. For economy of representation 
the values in Table 2 are blanks except where 
the correlations on the two samples are of the 
same sign and both beyond the 1% level of 
significance. Then a single value—the mean 
correlation (Fisher's z)—has been corrected 
for attenuation, by the given reliabilities of 
the Music Preference and 16 P. F. Test meas- 
ures, and recorded in Table 2. 
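
The rule for filling a cell of Table 2 can be sketched as follows. This is a minimal illustration, not the authors' computation: it assumes an unweighted mean of the two samples' correlations taken through Fisher's z, the usual correction for attenuation (division by the square root of the product of the two reliabilities), and critical values supplied by the caller. All numeric inputs in the usage lines are hypothetical.

```python
import math


def mean_r_fisher(r1: float, r2: float) -> float:
    """Average two correlations through Fisher's z (unweighted; an assumption)."""
    z = (math.atanh(r1) + math.atanh(r2)) / 2.0
    return math.tanh(z)


def correct_for_attenuation(r: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation between true scores from an observed r."""
    return r / math.sqrt(rel_x * rel_y)


def table_entry(r1, r2, crit1, crit2, rel_music, rel_pf):
    """Return the corrected mean correlation, or None for a blank cell.

    A cell is filled only when the two samples agree in sign and each
    correlation reaches its own critical value (crit1, crit2).
    """
    same_sign = r1 * r2 > 0
    if not (same_sign and abs(r1) >= crit1 and abs(r2) >= crit2):
        return None
    return correct_for_attenuation(mean_r_fisher(r1, r2), rel_music, rel_pf)


# Hypothetical inputs: correlations, critical values and reliabilities
# are invented for illustration and are not taken from the paper.
print(table_entry(0.40, 0.35, 0.25, 0.30, rel_music=0.83, rel_pf=0.80))
print(table_entry(0.40, -0.35, 0.25, 0.30, rel_music=0.83, rel_pf=0.80))  # None
```

Because the correction divides by a quantity below 1, corrected values are always larger in absolute size than the observed mean, which is worth keeping in mind when reading the magnitudes in Table 2.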

None of the correlations is large enough to 
demonstrate a one-to-one relation between the 
music factors and the personality factors.
But the set of 16 P. F. Test factors associated 
with any one music factor has a psycho- 
logically consistent and compatible character 
among the members in every case. For ex- 
ample, the personality factors correlating sig- 
nificantly with music factor No. 1 are domi- 
nance, surgency, toughness, radicalism and 
self-sufficiency—all possibly related to some 
second-order, comprehensive factor of tem- 
peramental toughness. Furthermore (and al- 
ternatively) the relative magnitudes of the 
correlations are such as could be compatible 




Table 1

Reliability Coefficients for Factor Measurements

         Part A: Consistency Coefficients                   Part B: Equivalence Coefficients
         (Whole Group)                                      (Form A with Form B)

         Half-Length   No. of Items   Spearman-Brown Corrected     Sample of     Sample of
Factor   Coefficient   in Half        to Full Length               102 Persons   71 Persons
   1       (.71)           5          .83                            a5            .64
   2       (.62)           5          .77                           .42            .57
   3       (.06)           5          .11 (used only experimentally) -.10          .24
   4       (.41)           5          .59                           .02            .19
   5       (.10)           5          .18 (used only experimentally) .11           .39
   6       (.27)           5          .43                           .38            .27
   7       (.41)           5          .58                           .15            .11
   8       (.46)           5          .63                           .38            .26
   9       (.00)           4          .00 (used only experimentally) .16          -.01
  10       (.14)           3          .25 (used only experimentally) .04            1
  11       (.37)           3          .55                           .28            .31


with a one-to-one relationship of music and 
personality factors if chance experimental 
error and the existing specious correlations 
among the factors within both the personality 
and the music area could be eliminated (notably
by longer scales for each factor and by
dropping items in one factor scale having any
correlation with another factor). A test of
this possible explanation must await much
further work on the purification of the pres- 


Table 2 


Correlations of Music Preference Factors and Personality Factors 


[Table body not recoverable: rows are the 16 P.F. factors, columns the eleven Music Preference Factors; entries are mean correlations corrected for attenuation, blank unless both samples agree in sign and reach the 1% level.]




ent factor scales. Meanwhile, however, this 
explanation rests on the indication that the 
highest correlation for a given music factor 
with any personality factor is also the highest 
correlation for the personality factor with 
any music factor. For example, factor 2 has 
its highest z with Q4, which z is also Q4’s 
highest with anything; factor 3’s highest is 
with H, which is also H’s highest; the factor 
4 column has its highest with M, which is also 
the highest in the M row, and so on, with very 
few exceptions (notably factor 8).
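
The reciprocity check described here (a cell that is the largest in its column and also the largest in its row) can be sketched as below. The function name and the small table of correlations are hypothetical and are not the values of Table 2; absolute size is compared, which is an assumption about how "highest" was judged.

```python
def mutual_maxima(corr):
    """Return (row, col) pairs whose |r| is largest in both their row and column.

    corr maps (personality_factor, music_factor) -> correlation.
    """
    best_in_row = {}  # personality factor -> music factor with largest |r|
    best_in_col = {}  # music factor -> personality factor with largest |r|
    for (row, col), r in corr.items():
        if row not in best_in_row or abs(r) > abs(corr[(row, best_in_row[row])]):
            best_in_row[row] = col
        if col not in best_in_col or abs(r) > abs(corr[(best_in_col[col], col)]):
            best_in_col[col] = row
    return sorted(
        (row, col) for row, col in best_in_row.items() if best_in_col.get(col) == row
    )


# Hypothetical correlations, for illustration only.
example = {
    ("I", "1"): -0.60,
    ("E", "1"): 0.30,
    ("Q4", "2"): -0.45,
    ("L", "2"): -0.35,
    ("I", "2"): 0.20,
}

print(mutual_maxima(example))
```

In this invented example only two pairs are reciprocal: E and L each have their largest correlation with a music factor whose own largest correlation lies elsewhere.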

As to the consistency of psychological
meaning among personality factors associated
with a given music factor we may mention, in
addition to factor 1 above, that factor 2
correlates negatively both with paranoid tendency
and nervous tension, which tendencies
have been previously found associated by
Darling (2); and that factor 4, which correlates
essentially with M (“Unconventionality vs.
Practical Concernedness”), also has
some association with Q2 (“Independent
Self-sufficiency”). The alternative possibility is
thus indicated, as suggested above, that where
a music factor does not align itself with a
first-order personality factor it may prove on
further research to correspond to a second-order
factor uniting the personality factors
in some underlying common influence. For
this reason music factor 1 has been called
“Tough Sociability vs. Tenderminded
Individuality,” which contingently restricts the
meaning pretty closely to the psychological 
bi-polarity of personality factor I, with which 
it is most associated, but also suggests fea- 
tures of the other factors with which it has 
some degree of association. The over-all de- 
scription of the personality dimension as- 
sociated with this particular music factor thus 
becomes remarkably similar to the Tender- 
vs. Tough-minded continuum described by 
William James (6). 


Results for the Abnormal Personalities 


As stated above, in the account of design, 
the test was administered to 98 hospitalized 
psychotics, divided into those four major syn- 
drome groups which had each a sufficient 
number of well-diagnosed cases to promise 
some significance of differences, if such should 
exist. 

The means and sigmas on all 11 factors are 
shown for normals, for abnormals as a whole, 
and for the four abnormal syndrome groups, 
in Table 3. 

The differences are examined below by the 
t test, first with respect to the differences be- 
tween the main psychotic group and the psy- 
chotic sub-groups, on the one hand, and the 
normal group on the other, with results as 
shown in Table 4. Nothing below a 10% 
probability is recorded in the P column. 
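
A comparison of this kind can be reproduced from group means, sigmas and sizes alone. The sketch below assumes the pooled-variance form of Student's t and treats each sigma as a sample standard deviation with an n - 1 denominator; the paper does not state which form was used, and the numbers in the usage line are hypothetical rather than taken from Table 3.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Returns (t, degrees_of_freedom). Assumes equal population variances.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Hypothetical group summaries, for illustration only.
t_value, df = pooled_t(mean1=12.0, sd1=2.0, n1=10, mean2=10.0, sd2=2.0, n2=10)
print(f"t = {t_value:.3f} on {df} degrees of freedom")
```

The resulting t would then be referred to a t table at the stated degrees of freedom to obtain the probability entered in the P column.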


Table 3 
Scores of Normal and Abnormal Groups 
Schizophrenics T 
f li (D-P) Manics Paranoids 
Normals eg a n = 22 n=10 n=7 
n= ji 
n i Mean Sigma Mean Sigma Mean Sigma 
Fact r; an Sigma Mean Sigma 

= Mean Sigma Mean = a 7 134 3.7 15.8 4.6 124 63 
13.6 5.7 14.5 5 64 44 10.0 4.7 10.2 3.6 12.1 46 

: 10.7 44 8.7 g0 21 8.6 23 90 14 94° 17 
à 96 27 8.9 a 41 3.0 55 25 8.2 26 SA: 2i 
5 6.8 3.3 48 re 122 23 11.5 3.0 12.7 2.3 10.4 1.2 
122 2.8 11.6 aR 114 2.7 10.7 2.8 10.2 2.2 94 26 

? 84 3. 10.8 re za 78 26 72 i6 7 Bs 
Š 23 32 7.0 a 73 21 84 19 91 26 84 27 
; 8.0 26 A 90 21 94 20 90 2.2 81 14 
i we 3.0 ae 55 19 6.0 17 59 21 66 15 
n S6 21 F 59 66 2.5 54 3.0 36 24 34 24 

Gt 24 - : 




texturally) music in favor of clear harmonic 
progressions, sweet melodies and subordinate 
accompaniment. The exception to this pat- 
tern is the manic group, which, on its distin- 
guishing factor (No. 4), prefers fast, ex- 
hilarating, stimulating pieces with textural 
complication, rhythmic variation and less 
obvious melodic outlines. These associations 
might roughly be explained in terms of em- 
pathy, but as more evidence accumulates 
they should receive more direct research in- 
vestigation, especially in the light of such 
research approaches as those of Rigg (7, 8). 


Summary 


1. A previously completed factor analysis of 
120 very diverse musical excerpts was used 
as a basis for construction of a Music Prefer- 
ence Test of Personality, set up to measure 
eleven factors by 100 items on two sides of a 
long-playing (33⅓ R.P.M.) record. As the
equivalence of the A and B forms is inade- 
quate for three or four of the factors, it is 
recommended that these be reserved for re- 
search improvement, by item analysis, and 
that the remaining seven or eight factors alone 
be used as internally valid measures in rou- 
tine applied psychology, notably in seeking 
external validities by predictions in clinical 
and guidance psychology. 

2. Since the established groupings of items
do not correspond to musical schools or
periods (though possessed of some consistency
of musical character) it is hypothesized that
they represent dimensions of personality
(especially of temperament) determining
taste. Correlation with the 16 Personality
Factor Questionnaire Test, on normal populations
of 102 and 71, confirmed this by yielding
many significant correlations.

A one-to-one relation of music preference
and personality factors cannot be proven by
these results, since both measures of factors
are imperfect. But the correlations, corrected
for attenuation, are at least consistent with
the hypothesis that, but for contamination,
the same personality dimensions determine, in
all but two cases, both the verbal and the
music preference factors. Contingent titles
have been given to the music preference fac- 
tors in accordance with the personality as- 
sociations. These titles proceed on the prob- 
ability that most music factors are primary 
personality factors though some may be 
second-order personality factors. 

3. Application of the Music Preference
Test to 98 patients in mental hospitals revealed
several factor measure differences, significant
at the 1% level, between psychotics
and normals and between various psychotic
syndrome groups. If confirmed on further
samples, these pattern differences are so
marked as to make the test a valuable adjunct
to psychiatric diagnosis. The meaning of the
music factors as indicated by the personality
factor correlations agrees well with the meaning
as found independently in terms of the
associations with psychotic syndrome groups.
These scales might therefore have value in
throwing further light on individual psychotic
syndromes.


Received February 24, 1953. 


References 


1. Anderson, H. H. and Anderson, G. L. Projective
techniques. Chap. 2. New York: Prentice-Hall, 1951.

2. Cattell, R. B. A guide to mental testing. London:
University of London Press, Third Edition, 1953.

3. Cattell, R. B. and Anderson, J. C. The I.P.A.T.
Music Preference Test of Personality. Institute for
Personality and Ability Testing, 1608 Coronado
Drive, Champaign, Illinois, 1953.

4. Cattell, R. B. and Saunders, D. R. Musical
preferences and personality diagnosis. I. A factor
analysis of 120 themes. J. gen. Psychol., 1953,
in press.

5. Cattell, R. B. and Wenig, P. W. Dynamic and
cognitive factors controlling misperception.
J. abnorm. soc. Psychol., 1952, 47.

6. James, W. Pragmatism: a new name for some old
ways of thinking. London:

7. Rigg, M. G. Musical expression: an investigation
of the theories of Erich Sorantin. J. exp.
Psychol., 1937, 21, 442-455.

8. Rigg, M. G. Speed as a determiner of musical
mood. J. exp. Psychol., 1940, 27, 566-571.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


A Rating-Scoring Method for Free-Response Data 


Ralph R. Canter, Jr. 


University of California, Berkeley 


In a study designed to evaluate a human
relations training program for executives and
supervisors (reported in detail elsewhere, 1)
a forced-normalizing method was employed to
score written answers to open-ended questions.
Answers to four questions contained in a
specially prepared Supervisory Questionnaire
were evaluated by four raters in accordance
with the procedures outlined in this paper.
This questionnaire was one of a number of
tests and other questionnaires administered
to an experimental (trainee) group and a
matched control group. The N was 18 in
each group.

The questions used in the Supervisory
Questionnaire were developed in light of three
considerations: (a) the trainees should be
given an opportunity to express themselves in
their own words concerning the kinds of problems
they had in their jobs; (b) the questions
should not be drawn from the course content,
but should be directed toward the individual
supervisor in his job; and (c) the questions
should not be structured in such a fashion
that certain kinds of answers would seem
important. As an example, one question used
was: “Do you feel you have the kind of
cooperation from your employees that you
want? What do you think accounts for
this?”


1 
Rationale and Procedure 


It was hypothesized that if we were to take
all the written answers to a single question
made by the persons in both the experimental
(E) and control (C) groups and have raters
sort them into n categories, the E and C
pretest distributions should be almost identical
and have the same mean. However, it was
thought the E responses following training

F, M. Fletcher of 


YTHS wri i 
Ohig £ Writer is indebted to Dr. AEN 
this. State University for assistance in g 


Corder Pod. 


is Board, 
and De Resear elli and Mr. 


, niversity of California. 455 


would be distributed by the raters in such a 
manner that the E mean would be reliably 
higher than the C mean, thus enabling us to 
conclude that the training was effective in 
producing change along the dimensions of the 
questions used.² The forced normal distribution
of judgments was considered an effective
method to use in this situation. The derived
procedures will be described in the context
of the investigation.

Since there was a total of 36 E and C re- 
sponses to each question, a normal area dis- 
tribution with an N of 36 was determined for 
seven categories, this number being arbitrarily 
used because of the small N and the relative 
ease for the raters. The numbers of cases 
falling in the respective categories were: 1, 3,
8, 12, 8, 3, and 1.
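
One way such quotas could be derived is sketched below. It is a reconstruction under stated assumptions, not the author's documented procedure: the seven categories are taken as equal-width intervals spanning three standard deviations either side of the mean, the two end categories absorb the tails, and each expected frequency is rounded to the nearest whole case. The function name and the `span_sd` parameter are hypothetical.

```python
from statistics import NormalDist


def forced_normal_quotas(n_cases: int, n_categories: int, span_sd: float = 3.0):
    """Number of responses to place in each category of a forced normal sort."""
    width = 2.0 * span_sd / n_categories
    cdf = NormalDist().cdf
    quotas = []
    for i in range(n_categories):
        low = -span_sd + i * width
        high = low + width
        # End categories take everything beyond the outer boundaries.
        lower_area = 0.0 if i == 0 else cdf(low)
        upper_area = 1.0 if i == n_categories - 1 else cdf(high)
        quotas.append(round(n_cases * (upper_area - lower_area)))
    return quotas


quotas = forced_normal_quotas(n_cases=36, n_categories=7)
print(quotas, sum(quotas))
```

With 36 cases and seven categories these assumptions give quotas that sum to 36, with 12 in the middle category and 8 in each adjacent one, matching the sorting instructions given to the raters. Independent rounding does not guarantee that the quotas sum to N for every N, so the total should be checked when the sketch is reused.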

Four raters were used, all being social 
scientists experienced in dealing with written 
questionnaire item responses and having spe- 
cific knowledge of desirable supervisory prac- 
tices and qualities. Each was given written
instructions, a summary of which follows. 
The general nature of the task was described 
and the specific questions were listed. The 
rater was asked to judge the responses to these 
questions in terms of the degree to which they 
reflected over-all supervisory quality. The 
rater was told that the responses came from 
Ss in E and C groups, but that the procedure 
required him to be in ignorance of whether a 
respondent was in the E or C group. He was 
then instructed to sort the responses to the 
first question into the seven categories in ac- 
cordance with the assigned numbers (i.e., 12 
responses in Category No. 4, 8 each in Cate- 
gory No. 5 and No. 3, and so on). The exact 
procedure was described, essentially involving 

² In the major study (1) the trained supervisors
were found to have gained in mean score at a
statistically significant level of confidence over the
untrained supervisors on the Supervisory Questionnaire.
This measure also intercorrelated highly with other
tests and measures on which statistically significant
gains were found.




separation of the best and poorest responses 
at each step. He next proceeded to the 
second question and so on. The rater was 
never informed as to whether he was dealing 
with a set of pretest or posttest responses. 
Each response was scored by summating 
the category values assigned by the four 
raters. The number of a category was used 
as a score. For example, the four judges may 
have respectively placed a response in the 
following numbered categories: 2, 3, 3, and 4. 
The response would receive a score of 12. 
Each individual’s four scores were then sum- 
mated, this being his over-all questionnaire 
score (which was treated statistically within 
the framework of the larger investigation). 
Records by separate item scores and by 
total scores were kept for each rater so that 
inter-rater reliability could be estimated. 
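
The scoring arithmetic can be written out as a brief sketch. The function names are illustrative, and the ratings for questions 2 to 4 in the usage lines are hypothetical; only the first set (2, 3, 3, 4) comes from the worked example in the text.

```python
def response_score(categories_by_rater):
    """Score for one response: sum of the category numbers given by the raters."""
    return sum(categories_by_rater)


def questionnaire_score(ratings_by_question):
    """Over-all score for one respondent: sum of the response scores."""
    return sum(response_score(ratings) for ratings in ratings_by_question)


# First entry is the example from the text; the rest are invented.
one_respondent = [
    [2, 3, 3, 4],  # question 1, categories assigned by raters A to D
    [4, 4, 5, 4],  # question 2 (hypothetical)
    [3, 4, 4, 3],  # question 3 (hypothetical)
    [5, 4, 4, 6],  # question 4 (hypothetical)
]

print(response_score(one_respondent[0]))
print(questionnaire_score(one_respondent))
```

With seven categories and four raters, a single response can score from 4 to 28, and a four-question total from 16 to 112.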


Inter-Rater Reliability 


Table 1 contains the Pearsonian correlation 
coefficient between raters on the summated 
questionnaire score (i.e., total of scores as- 
signed by each rater to each respondent on 


Table 1

Inter-Rater Reliability Coefficients for Supervisory
Questionnaire Summated Scores

Pretest
Rater        A       B       C
  B         .52
  C         .52     .63
  D         .54     .63     .66
Average Intercorrelation Coefficient*                  .58
Total Summated Questionnaire Score Reliability
(Corrected by Spearman-Brown formula with
four raters)                                           .85

Posttest
Rater        A       B       C
  B          mi
  C          09     .68
  D         .68     .70      al
Average Intercorrelation Coefficient*                  .65
Total Summated Questionnaire Score Reliability
(Corrected by Spearman-Brown formula with
four raters)                                           .88

* Obtained by formula 118, p. 197, Peters and Van Voorhis (3).




each of the four items) for both the pretest 
and posttest. The inter-rater correlations are 
not reported for the four separate items; we
wish only to note that the range of these cor- 
relations on the pretest was from 0.31 to 0.71 
with a mean of 0.47, and 0.26 to 0.79 on the 
posttest with a mean of 0.49. 
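
The reliability of the four-rater summated score reported in Table 1 can be checked from the average intercorrelations. The sketch assumes the general Spearman-Brown formula with the number of raters as the lengthening factor. It also uses a plain arithmetic mean of the six pretest coefficients as a stand-in for the averaging formula cited in the table footnote, which is an assumption; the two happen to agree to two decimals here.

```python
def mean_intercorrelation(pairwise):
    """Plain arithmetic mean of the pairwise rater correlations (an assumption)."""
    return sum(pairwise) / len(pairwise)


def stepped_up_reliability(mean_r: float, n_raters: int) -> float:
    """Spearman-Brown reliability of the sum of n_raters parallel ratings."""
    return n_raters * mean_r / (1.0 + (n_raters - 1) * mean_r)


# Pretest coefficients as printed in Table 1.
pretest_pairs = [0.52, 0.52, 0.63, 0.54, 0.63, 0.66]
pretest_mean = round(mean_intercorrelation(pretest_pairs), 2)

pretest_reliability = round(stepped_up_reliability(0.58, 4), 2)
posttest_reliability = round(stepped_up_reliability(0.65, 4), 2)

print(pretest_mean, pretest_reliability, posttest_reliability)
```

Both stepped-up values agree with the reliabilities reported in Table 1 for the pretest and posttest.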


Discussion 


The inter-rater reliabilities appear to be 
quite adequate and within the usual range of 
reported reliabilities. Rating each question 
separately has the effect equivalent to adding 
more raters (2). Also, it is possible to as- 
sume that a fairly high degree of unidimen- 
sionality is accorded to each item (the rater 
has only a single question to keep before 
him). The responses can be viewed as homo- 
geneous since the rater has only a single 
criterion to keep in mind—in this study good- 
ness of response as related to supervisory 
quality. These conditions act to increase 
reliability. 

In using a technique such as this it appears
that some of the hazards involved in
trying to get scales can be avoided. However,
Suchman (4) has pointed out the difficulties
with “non-itemized” judgments and
ratings, noting especially that such procedures
produce no definition of the variable
under consideration. With this we must concur.
But much depends upon the uses to be
made of such ratings. In the example used
the major intent was to determine whether
the training appeared to have any effect
on the trainees’ free responses about how they
performed in their jobs.

Subsequent studies would be required to
specify the correlations between the particular
training content and the observed effects,
as usually is the case. From this standpoint
the method proposed here is best viewed as
one which determines whether further studies
are warranted.




Summary 




evaluating a human relations training course 
was described wherein the criterion used by 
four judges was over-all supervisory quality 
as revealed in pretest and posttest written 
responses made by experimental and control 
subjects in regard to their job performance. 
Satisfactory inter-rater reliabilities were 
found.


Received February 2, 1953. 


References 


1. Canter, Ralph R. An experimental study of a
human relations training program. J. appl.
Psychol., 1951, 35, 38-45.

2. Furfey, P. H. An improved rating scale technique.
J. educ. Psychol., 1926, 17, 45-48.

3. Peters, C. C. and Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill, 1940.

4. Suchman, E. A. The logic of scale construction.
Educ. psychol. Measmt., 1950, 10, 79-93.


THE JOURNAL OF APPLIED PSYCHOLOGY 
Vol. 37, No. 6, 1953 


Factors Affecting Student Evaluation of College Faculty 
Members


Alexis M. Anikeeff 


Department of Institutional and Industrial Management, Mississippi State College * 


For the purpose of evaluating the potential 
usefulness of a student evaluation program 
of college faculty members, an investigation 
was initiated to determine the effect of grad- 
ing leniency upon merit rating scores of 
faculty members in the School of Business 
and Industry. To the extent that grading 
leniency was highly correlated with merit rat- 
ing scores, the usefulness of student evalua- 
tion could be seriously questioned. 

As a corollary of the basic study, the rela- 
tionship between absence extensiveness and 
faculty ratings was investigated to supple- 
ment data uncovered in a previous study 
about the effectiveness of an unlimited ab- 
sence regulation. To the extent that highly 
rated instructors attract students to their 
classrooms despite an unlimited absence regu- 
lation, restricting the number of absences 
which students are allowed may merely serve 
to bolster the security feelings and egos of 
lowly rated instructors, and further frustrate 
the students who are engaged in the program 
of evaluating faculty members. 


Procedure 


Grading leniency was determined by deriving
the arithmetic mean of quality points
issued by each faculty member to his students,
and then by ranking each faculty
member according to obtained means. Owing
to the selection process operating during a
four year college curriculum, the average
class grade increases progressively from the
freshman to the senior level. Consequently,
in order to control variation based upon
academic level, rather than upon faculty leniency,
separate ranking distributions were made for
freshman-sophomore and junior-senior levels.

The determination of absence extensiveness
was accomplished by ranking nineteen faculty
members according to the median number of
class absences accumulated in their classes.
Ranking distributions were made for
freshman-sophomore and junior-senior levels, as
well as the freshman through senior levels
combined.

1 Now at Oklahoma A & M College.

Since the evaluation data of faculty mem- 
bers were available in ranked form, Spearman 
rho was used to determine the relationship 
between grading leniency and merit rating 
scores. The same technique was used to un- 
cover the relationship between absence exten- 
siveness and merit rating scores. In addition, 
multiple correlation analysis was used to determine
the combined effect of grading leniency
and absence extensiveness upon the merit
rating scores of faculty members. 
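
The two statistics named in this paragraph can be sketched as follows. The rank-difference formula for rho assumes untied ranks, and the two-predictor multiple correlation is computed from the three pairwise coefficients; feeding rank correlations into that formula is an assumption about how the multiple correlation was obtained, and the rankings in the usage lines are hypothetical.

```python
import math


def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-difference correlation for two untied rankings."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of y with two predictors from pairwise correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical rankings of five instructors, for illustration only.
rating_rank = [1, 2, 3, 4, 5]
leniency_rank = [2, 1, 3, 5, 4]

rho = spearman_rho(rating_rank, leniency_rank)
print(f"rho = {rho:.2f}")
print(f"R = {multiple_r(0.3, 0.4, 0.0):.2f}")
```

When the two predictors are uncorrelated, the squared multiple correlation is simply the sum of the two squared correlations, which gives a quick check on the formula.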

Absence and grading data are based upon
reports which 19 faculty members submitted
to the Dean of the School of Business and
Industry. Faculty evaluation data are based
upon information secured during regular class
periods by members of an honorary scholastic
fraternity after mid-semester grades were
posted, but before final grades were assigned.
Instructors were rated on an eight-part
graphic rating scale which permitted distribution
of judgments along a continuum of
verbally described points. The following factors
were used: 1. knowledge of the subject;
2. class preparation; 3. clarity of speech; 4.
avoidance of sarcasm; 5. fairness in grading;
6. absence of mannerisms; 7. creation of
interest in subject matter; and 8. ability to
control temper.

Under provisions of the evaluation program,
an instructor left his classroom earlier than
usual and permitted students to rate him during
his absence. Whenever an instructor was
rated by less than 50 students, or by less than
three different classes, he was excluded from
the standardizing population, and is also
excluded from the present study. The obtained
ranking distributions are based upon approximately
1,500 cases.




Results 


On the freshman-sophomore level, a very
significant and moderately high positive relationship
is found between grading leniency
and faculty merit rating scores. Consult Table
1 for more specific information. Through the
use of a coefficient of determination described
by Guilford,² it is evident that approximately
53% of the variance in the rating received
by an instructor who teaches freshman-sophomore
level courses can be attributed to grading
leniency. No significant relationship is
found between obtained rating and grading
leniency on the junior-senior level. On a
combined basis, freshman through senior, a
significant but moderate relationship exists
between obtained rating and grading leniency.
Approximately 25% of the variance in
an instructor’s rating can be attributed to
grading leniency on an over-all four-year basis.


Table 1


Spearman Rank-Difference Correlations for Faculty
Members Ranked on Three Factors


                                 Compared Rankings

                            Grades vs.   Rating vs.   Grades vs.
Class Level           N     Absences     Absences     Rating
Freshman-Sophomore    13    -.21         -.17         [illegible]
Junior-Senior         17    -.19         -.26         .43
Freshman-Senior       19    -.08         [illegible]  [illegible]


* Significant at the five per cent level.
** Significant at the one per cent level.


No significant relationships are found between
obtained rating and absence extensiveness,
on either the freshman-sophomore or the
junior-senior levels. However, a significant,
negative, and moderate relationship is found
between instructors ranked according to merit
rating scores obtained by student evaluation
and the same instructors ranked according to
the median number of absences accumulated
in their classes, on a four-year breakdown.
Thus, an instructor with a lower number of
absences receives the higher rating. Approximately
28% of the variance in absence extensiveness
can be accounted for by the merit
rating scores of instructors. By the same
token, 28% of the variance in an instructor’s
rating can be attributed to absence extensiveness.

1 Guilford, J. P. Fundamental statistics in psychology
and education. (2nd Ed.) New York: McGraw-Hill, 1950.


Table 2


Multiple Correlations Between Faculty Ratings and
Grading Leniency Combined with
Absence Extensiveness


                            Correlation Data

Class Level           N     R             SE_R
Freshman-Sophomore    13    [illegible]   [illegible]
Junior-Senior         17    [illegible]   .21
Freshman-Senior       19    [illegible]   .13


** Significant at the one per cent level.

No significant relationship is found be- 
tween instructors ranked according to grading 
leniency and instructors ranked according to 
the median number of class absences accumu- 
lated in their classes. The lack of such rela- 
tionship is evident on the freshman-sopho- 
more, junior-senior, and the combined fresh- 
man through senior levels. 

When absence extensiveness and grading 
leniency are combined, and the overlap be- 
tween these factors is held constant, the com- 
bination of these factors accounts for ap- 
proximately 53% of the variance in an in- 
structor’s rating on the freshman-sophomore 
level. Consult Table 2 for detailed data. 
The same combination of factors accounts for 
about 22% of the variance in an instructor’s 
rating on the junior-senior level, and for ap- 
proximately 50% of an instructor’s rating 
variance on the freshman through senior 
levels. 
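
As a rough check on the combined-variance figures, the standard two-predictor multiple correlation formula can be applied to the pairwise coefficients. The sketch below treats the rank correlations as ordinary correlation coefficients and uses the junior-senior values as they can be read from Table 1 (.43, .26, and .19 in magnitude). The negative signs on the two absence coefficients are an assumption, based on the Summary's statement that absence extensiveness correlated negatively on all levels. The function name is hypothetical.

```python
from math import sqrt


def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation of y on two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Junior-senior level; signs of the absence coefficients are assumed.
grades_vs_rating = 0.43
absences_vs_rating = -0.26
grades_vs_absences = -0.19

r_squared = multiple_r_squared(grades_vs_rating, absences_vs_rating, grades_vs_absences)
multiple_r = sqrt(r_squared)
```

Under these assumptions the squared multiple correlation comes out near .22, in line with the "about 22%" reported for the junior-senior level.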


Discussion 


For the sample used, the grades which 
faculty members assign students are reflected 
in the quality of rating which students assign 
faculty members. The extent to which stu- 
dents evaluate faculty members according to 
class grades received varies with academic 
levels of the students. Grading leniency ac- 
counts for almost three times as much vari- 
ance in faculty ratings on the freshman-sopho- 
more level, as it does on the junior-senior
level where the relationship is not statistically
significant. Conceivably, students who sur- 
vived the selection process operating during 
the first two years consider faculty grading 
leniency as a relatively unimportant criterion 
on which to base evaluation of faculty 
members. 

Class absences are negatively correlated 
with faculty ratings. These results may indi- 
cate varying degrees of student interest in the 
classroom behavior of faculty members. If 
such is the case, lowly rated faculty members 
apparently repel students from their class- 
rooms, and accordingly, accumulate dispro- 
portionate numbers of class absences. How- 
ever, in the interpretation of the relationship 
between absence extensiveness and faculty 
rating scores, it should be noted that the fac- 
tor of absence permissiveness remains uncon- 
trolled. 

Absence permissiveness could operate di- 
rectly or indirectly. Direct operation could 
involve open avowal or subtle implication, on 
the part of highly rated instructors, that class 
attendance is not required for satisfactory 
course performance. Conversely, lowly rated 
instructors may insist upon daily attendance 
to the point where daily recitation grades 
carry an unduly preponderant weight in the 
determination of final class grades. Loading 
course examinations with textbook questions 
while minimizing the inclusion of lecture questions,
could illustrate the manner in which
indirect absence permissiveness would operate. 


Summary 


Nineteen faculty members were ranked in 
accordance with the merit rating scores as- 
signed to them by their students. Using 
Spearman rho, merit rating ranks were correlated
with grading leniency and absence ex-
tensiveness rankings of the same instructors. 

1. Grading leniency correlated highest with 
merit rating scores on the freshman-sopho- 
more level, and lowest on the junior-senior 
level. 

2. Absence extensiveness correlated nega- 
tively on all academic levels, but the correla- 
tion was significant only on the combined 
four-year breakdown. 

3. The selection process operating during
the freshman-sophomore years could reasonably
account for a low and statistically non-significant
relationship between grading leniency
and student evaluated faculty merit
ranking on the junior-senior level.

4. Class interest of students could account 
for the negative relationships between faculty 
members ranked according to the number of
class absences found in their classes and the
same instructors ranked according to teaching 
competence as evaluated by students. 


Received March 6, 1953. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


Estimating Grade Reliability * 


Scarvia B. Anderson 


Naval Research Laboratory, Washington 25, D. C.


When grade point ratio is used as the 
criterion of school “success” or “failure,” the 
need for an adequate estimate of the relia- 
bility of the ratios presents a recurring prob- 
lem. The author encountered the problem 
most recently in a study at George Peabody 
College for Teachers of the value of certain 
entrance tests in predicting freshman grade 
point ratio (1). If the results were to be
used for the selection and counseling of stu- 
dents and for the determination of an ade- 
quate testing program, some estimate of the 
reliability of the criterion seemed essential. 

The tests used were the American Council 
on Education Psychological Examination; the 
Cooperative Reading Comprehension Test; 
the Cooperative Mechanics of Expression 
Test; and Otis Quick-Scoring Mental Ability
Tests for grades four through nine, which
were used as practice tests. Multiple correlations
of test scores with weighted grade
point ratios were .62 and .59 for one quarter
and three quarters, respectively. It was reasonable
to assume that grade point ratios for
three quarters were more reliable than those
for one quarter, but a statistical estimate of
such reliability seemed desirable in the final
interpretation of the results.

Ebel (4) has presented a method, based
on analysis of variance, for estimating the
reliability of sets of ratings, and in addition
has considered in some detail the relationship
between this procedure and those of Horst
(5), Snedecor (7), Clark (2), Peters (6), and
Cureton (3). He concludes that the intraclass
formula, such as his, Cureton’s, and
Snedecor’s, is generally preferable to the
average intercorrelation or generalized reliability
formula, such as Horst’s, Clark’s, and
Peters’.

1 The author is deeply indebted to Dr. Julian C.
Stanley, University of Wisconsin, for material help
in the preparation of this article and to Dr. [illegible]
Cureton, University of Tennessee, and Dr. Clarence W.
[illegible], George Peabody College for Teachers.
The opinions and assertions contained herein are the
private ones of the writer and are not to be construed
as official or reflecting the views of the Navy
Department or the naval service as a whole.

In the present paper, we shall discuss: (a) 
the differences obtained when the Horst and 
Cureton formulas were used to estimate the 
reliability of the same sets of freshman 
grades; and (b) a second problem, which 
arose in the application of the formulas, the 
use of unweighted versus weighted grade point 
ratios. 

Both Cureton’s formula (which, inciden- 
tally, is the result of a derivation parallel to 
Ebel’s) and Horst’s are based on the well- 
known generalized formula for the reliability 
coefficient: 

$$r = 1 - \frac{\sigma_e^2}{\sigma_o^2}$$

In application here, the error variance is an 
estimate of the error variance of the individ- 
ual means, and the observed variance is the 
variance of the means for all of the in- 
dividuals. 

Cureton’s and Horst’s formulas are shown 
in Table 1. The chief statistical differences 
between them may be summarized as follows: 

1. Cureton uses a weighted variance for 
the estimate of the error variance, and Horst 
does not. 

2. Cureton uses a weighted variance of the 
person means, and Horst does not. 

3. Cureton divides by N - 1 in the variance
of the means, and Horst divides by N.
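
The generalized idea behind both formulas, one minus the ratio of the error variance of the individual means to the observed variance of those means, can be sketched as below. This is an illustrative unweighted variant only and is not offered as either Horst's or Cureton's exact formula. The function name, the `divide_by_n_minus_1` switch, and the sample scores are all hypothetical.

```python
from statistics import mean, pvariance, variance


def reliability_of_means(scores_by_person, divide_by_n_minus_1=False):
    """Estimate r = 1 - (error variance of means) / (observed variance of means).

    scores_by_person: one list of scores per individual (at least two scores
    each); the lists may have unequal lengths.
    """
    person_means = [mean(scores) for scores in scores_by_person]
    # Error variance of one person's mean: within-person variance over n_i.
    error_variances = [variance(scores) / len(scores) for scores in scores_by_person]
    error_variance = mean(error_variances)  # unweighted average over persons
    # Observed variance of the means, dividing by N - 1 or by N.
    if divide_by_n_minus_1:
        observed_variance = variance(person_means)
    else:
        observed_variance = pvariance(person_means)
    return 1.0 - error_variance / observed_variance


# Hypothetical grade points for three students with unequal numbers of courses.
grades = [[8, 9, 10], [2, 3, 4], [5, 6, 7, 6]]
r_divide_by_n = reliability_of_means(grades)
r_divide_by_n_minus_1 = reliability_of_means(grades, divide_by_n_minus_1=True)
```

Switching the divisor changes only the denominator, which is why the choice matters more when N is small.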

A careful study of the two methods and 
their respective relevance to freshman grade 
point ratio suggested that Cureton’s technique 
was more appropriate for our use. If our 
freshmen were considered a sample of a uni- 
verse of Peabody freshmen, the relevance of 
dividing by N — 1 was apparent. In addition, 
we agreed with Cureton that his formula 
would give a somewhat better reliability esti- 
mate, since the values he uses for error vari- 
ance and total variance of the person means 
are “statistically independent in the sense of
analysis of variance, while those given [by
Horst] . . . are not quite independent” (3,
p. 2). Still more important, however, since
the results of the freshman test study were to
be used for predictive purposes, the weighting
of the variances, so that a mean based on
a smaller number of measures would receive
a smaller weight than a mean based on a
larger number of measures, should furnish a
better population estimate of reliability.


Table 1


Formulas for Estimating Reliability of Means of Unequal Numbers of Scores*


[Formulas (1H) to (4H) of Horst (5) and (1C) to (4C) of Cureton (3); the expressions are illegible in this copy.]


* N = number of individuals, Sx_i = sum of the deviation scores for individual i, SX_i = sum of the raw
scores for individual i, n_i = number of scores for individual i, n = mean number of scores for N individuals,
M_i = mean score for individual i, M = mean score for N individuals.

However, it was decided to use both meth- 
ods in estimating the reliability of the fresh- 
man grades for one quarter and for three 
quarters, in order that the results might be 
compared for discrepancies.²


At this point, it seems well to consider
briefly the first problem encountered in application
of the two formulas. In the original
analysis of the relationship between grades
and test scores, grades were weighted according
to the number of quarter hours that a
course carried:³

$$M_{wi} = \frac{\sum_{c=1}^{n_i} h_{ic} P_{ic}}{\sum_{c=1}^{n_i} h_{ic}}$$

where $M_{wi}$ = weighted grade point ratio for individual i.

$h_{ic}$ = number of quarter hours in one course c that individual i takes.

$P_{ic}$ = points assigned to a grade in one course c that individual i takes.

$n_i$ = number of courses that individual i takes.

It was reasoned that a grade in a five hour
course would be considerably more reliable
than a grade in, say, a two hour course, for in
a five hour course the instructor would meet
with a student more frequently, would probably
give more quizzes, and would generally
be in a better position to give a more (subjectively)
reliable grade. However, if this
viewpoint was maintained, several adjustments
would have to be made in order to use
Cureton’s or Horst’s formula. We could not
let n_i equal the number of quarter hours taken
by an individual, since that interpretation
would not be compatible with the original
meaning of n_i in the formulas. A letter from
Dr. Cureton granted that a grade in a five
hour course would probably be somewhat
more reliable than a grade in a two hour
course; however, he indicated that the difference
would be less than might be suggested
by the weights of 5 and 2, for in most courses,
regardless of the number of hours, a final
examination is given and instructors generally
try to administer enough tests to give
(subjectively) reliable grades.

2 It is realized that any technique for estimating
reliability assumes independence of measures. In the
case of grade point ratio, it is difficult, if not impossible,
to meet this assumption. In addition to the
possibility of teachers discussing among themselves
the ratings they give students, a single grade point
ratio may contain grades from two or more courses
under one professor. In order to obtain a statistical
estimate of reliability, however, one seems to have
no choice but to use one of the standard reliability
formulas, recognizing the limitations imposed by the
conditions usually surrounding college grading.

3 Points were assigned on the basis of the following
scale: A+ = 12 points, A = 11 points, A- = 10
points, B+ = 9 points, B = 8 points, B- = 7 points,
C+ = 6 points, C = 5 points, C- = 4 points, D+
= 3 points, D = 2 points, D- = 1 point, F = 0
points.

Unweighted grade point ratios were computed
as follows:

$$M_i = \frac{\sum_{c=1}^{n_i} P_{ic}}{n_i}$$

where $M_i$ = unweighted grade point ratio for
individual i.

These correlations between weighted and
unweighted grade point ratios were obtained:

.98 between weighted and unweighted grade
point ratios for three quarters.

.96 between weighted and unweighted grade
point ratios for one quarter.


As a result, any advantage of weighting
seemed so small that we were satisfied to go
ahead with the calculation of the reliability
coefficients, using unweighted grade point
ratios and letting n_i equal the number of
courses that individual i took.
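
The two grade point ratios compared above can be illustrated with a short sketch. The point scale follows the footnoted scale as far as it can be read in this copy; the entries for A-, B-, D+, and F are filled in on the assumption that the scale is evenly spaced from 12 down to 0. The function names and the sample course list are hypothetical.

```python
# Grade-to-points scale; A-, B-, D+ and F are assumed from the even spacing.
POINTS = {
    "A+": 12, "A": 11, "A-": 10,
    "B+": 9, "B": 8, "B-": 7,
    "C+": 6, "C": 5, "C-": 4,
    "D+": 3, "D": 2, "D-": 1,
    "F": 0,
}


def weighted_gpr(courses):
    """Grade point ratio weighted by quarter hours; courses = [(grade, hours), ...]."""
    total_hours = sum(hours for _, hours in courses)
    return sum(POINTS[grade] * hours for grade, hours in courses) / total_hours


def unweighted_gpr(courses):
    """Grade point ratio giving every course equal weight."""
    return sum(POINTS[grade] for grade, _ in courses) / len(courses)


# Hypothetical one-quarter record: (letter grade, quarter hours).
record = [("A", 5), ("C", 2), ("B", 3)]
weighted = weighted_gpr(record)
unweighted = unweighted_gpr(record)
```

The two ratios differ only to the extent that a student's grades vary with course size, which is consistent with the high correlations between them reported above.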

The Cureton raw-score formula (4C) was
used first, and then since M_i, M, and SX_i
were already known, the raw score form
(4H) of the Horst formula was used. The
resulting reliability coefficients were as follows:

Grades for      Horst         Cureton
3 quarters      .90           .90
1 quarter       [illegible]   .63

Although the first two r’s are identical,
there is a rather wide discrepancy between the
second two r’s. It seems that there is logically
no statistical method for testing the significance
of the difference between these two
estimates of the reliability of the same measures,
and one must return to the original
formulas for an explanation of the numerical
differences. The fact that the reliability coefficients
for three quarter grade point ratios
are the same and for one quarter grade point
ratios are different is directly attributable
to the weighting process used in Cureton’s
formula. With the large values of n_i (mean n_i
equals 13.48) for three quarters, the weighting
seems to have a negligible effect on the
value of r; while with the smaller n_i (mean n_i
equals 4.62) for one quarter, the weighting
process results in a considerable difference between
the variance estimates substituted in
Cureton’s and those substituted in Horst’s
formula. Cureton’s formula gave for one
quarter a considerably smaller estimate of
error variance than did Horst’s.

The immediately obvious conclusions that 
were drawn from the computed reliability co- 
efficients were that: (1) the use of one quar- 
ter’s grades alone would not be adequate for 
our purposes; and (2) three-quarter grade 
point ratios represented a fairly reliable criterion.⁴
The difference obtained between the
Horst and Cureton coefficients for one quarter 
did not affect these conclusions. If there had 
been a question of interpretation, we should 
have used the reliability estimate given by 
Cureton’s formula for the reasons already 
given. 

In other cases where reliability estimates 
of unequal numbers of ratings were to be 
made, we would generally tend to use Cure- 
ton’s formula when we were interested in re- 
liability for the prediction of population be- 
havior from a sample of that population or 


4 Interpretation is much more difficult than this 
statement indicates, however. Since many of the 
students had fewer different teachers than quarter- 
courses during the three quarters (especially in the 
required English sequence), grades received by an 
individual from quarter to quarter were by no means 
independent of each other. How much less the esti- 
mated reliability coefficient of .90 would be if true 
independence existed cannot be judged from these 
data. It is interesting to note that despite markedly 
poorer reliability of first-quarter grade point ratios 
the multiple R between test scores and first-quarter 
GPR’s was slightly higher than for the entire year 
(.62 vs. .59). 


Table 2


Means and Standard Deviations for Test Scores
and Course Grades


                                         Standard
Variable                        Mean     Deviation
ACE Q                           41.7     7.9
ACE L                           61.2     16.3
Cooperative English             141.2    29.5
Cooperative Algebra             34.0     12.2
Minnesota Paper Form Board      45.3     7.9
Bennett Mechanical              32.7     12.6
Mathematics                     6.3 (a)  2.8
English                         6.1 (a)  2.5
Engineering Drawing             7.4 (a)  2.0
Civil Engineering               2.9      1.2
Mechanical Engineering          2.4      1.2
Engineering Problems            2.7      1.3


(a) Summation of three grades.


fall quarter, 1950, through the fall quarter, 
1951, constituted the criteria for the study. 
Freshman year grades generally take care of 
most of the screening of engineering candi- 
dates at the University of Tennessee, as fail- 
ures are much more unlikely after the first 
few quarters in the engineering curriculum. 
The entering class is a relatively heterogene- 
ous group, as no selection procedures are used 
other than a minimum mathematics require- 
ment of four high school units. 

Instead of using the mean point hour ratio 
for all courses combined, correlation coef- 
ficients were computed for the various tests 
with grades in the courses in the freshman 
engineering curriculum. Table 1 shows the 
various courses and tests for which correla- 
tions were computed. 


William Coleman


Discussion of Results 


Though the coefficients found in Table 1 
are not especially high, several of them are 
sufficiently so to be regarded as useful for 
selection or guidance situations. With a 
population consisting of high school students, 
a higher correlation would be hypothesized 
for this more heterogeneous group. 

The best predictive instrument in the bat- 
tery used seems to be the Cooperative Alge- 
bra Test; the Cooperative English Test ranks 
second. The Bennett Mechanical Comprehension
Test tends to get better correlations
with grades than either of the A.C.E. scores.
In an unpublished master’s thesis at the University
of Tennessee, Tarvin (6) found that
the algebra and English tests yielded higher
correlations than either A.C.E. score among
freshman students. From these data and
other studies (4, 11), the predictive value of
so-called scholastic aptitude tests such as the
A.C.E. must be questioned in comparison with
outright achievement tests.

Further examination of Table 1 will reveal
that in different courses different instruments
may be the most effective predictors. It is
no surprise to find the algebra test predicting
mathematics grades best, and the English test
performing in a similar fashion for English
grades. The Bennett is clearly the best predictor
in engineering drawing instead of the
Minnesota Paper Form Board as might have
been expected. No test emerges as a good
predictor for civil engineering. This may reflect
to some extent the unreliability of grades
in this course though further evidence is
needed. In mechanical engineering the English
test stands out as the best predictor. Does
this reflect an emphasis in grading in this
course on competence in English usage? The
algebra test and the Q score seem to be
the best predictors in engineering problems,
though the Bennett provides a moderate correlation
coefficient.


Table 3

Multiple Correlation Work Sheet

Criterion         Test Predictors                                            R
Mathematics       Alg. (.558) + Bennett (.612)** + Eng. [illegible]          [illegible]
English           Eng. (.608) + Alg. (.614) + Bennett [illegible]            [illegible]
Engr. Drawing     Bennett (.453) + Alg. (.505)** + ACE L [illegible]         [illegible]
Civil Engr.       Alg. (.290) + Eng. (.310) + Minn. [illegible]              [illegible]
Mech. Engr.       Bennett (.556) + Eng. (.664)** + Alg. [illegible]          [illegible]
Engr. Problems    Alg. (.496) + Bennett (.595)** + [illegible]               [illegible]


** Increment in R significant at 1% level.
* Increment in R significant at 5% level.


Test Battery for Predicting Freshman Engineering Grades

It is interesting to note that the Q score is 
More valuable than the L for this engineering 
group in the courses considered. This, of
course, is contrary to the usual findings with 
the A.C.E. in other curricula (3, 11). The 
Minnesota Paper Form Board yielded gen- 
erally the lowest correlation coefficients of 
any of the tests. 

Multiple correlations were then computed 
for four of the criterion variables, grades 
in English, engineering drawing, engineering
problems, and mathematics. Table 3 presents
these data showing the best multiple correla- 
tions that can be obtained with the tests used. 

The addition of further tests does not add 
much in the case of English and mathematics 
where the zero order correlations were moderately
high in the first place. In engineering
drawing and engineering problems the extra
tests appreciably contribute in improving the
correlation coefficients, from .496 to .612 for
the problems course, and from .453 to .541
for the drawing course. Additional tests seem
warranted for more reliable prediction in the
case of these two courses.


Summary 


In conclusion, it can be stated that this
study has demonstrated the satisfactory applicability
of several economical (in terms of
administration and cost) tests for the problem
of selecting or guiding prospective engineering
students. The tests which produced
the best correlation coefficients were the Cooperative
Algebra, Cooperative English, and
the Bennett Mechanical Comprehension Tests.
The A.C.E. Psychological Examination and
the Minnesota Paper Form Board were not
as adequate. The findings in this study seem
to generally confirm those of previous investigators.


Received February 11, 1953. 


References 


1. Berdie, R. F. Differential aptitude tests as pre- 
dictors in engineering training. J. educ. Psy- 
chol., 1951, 42, 114-123. 

2. Berdie, R., Dressel, P. and Kelso, P. Relative 
validity of the Q and L scores of the ACE 
Psychological Examination. Educ. psychol. 
Measmt., 1951, 11, 803-812. 

3. Cole, A. W. Predicting success in engineering. 
Department of Vocational Education, Univer- 
sity of Arkansas: Fayetteville, Arkansas, 1951. 

4. Hellmer, L. A. Unpublished data from the Uni-
versity of Illinois, 1953. 

5. Johnson, A. P. College Entrance Examination
Board Mathematical Tests: (a) and the Pre- 
Engineering Inventory; and (b) as predictors 
of scholastic success in colleges of engineering. 
Amer. Psychologist, 1950, 5, 353. (Abstract) 

6. McNemar, Q. Psychological statistics. New 
York: Wiley, 1949. 

7. Moore, J. E. A decade of attempts to predict 
scholastic success in engineering schools. Oc- 
cupations, 1949, 27, 92-96. 

8. Pierson, G. A. and Jex, Frank B. Using the 
Cooperative General Achievement Tests to 
predict success in engineering. Educ. psychol. 
Measmt., 1951, 11, 397-402. 

9. Stuit, D. B. Predicting success in professional 
schools. Washington, D. C.: American Coun- 
cil on Education, 1949. 

10. Tarvin, J. C. Prediction of freshman course 
grades at The University of Tennessee. Un- 
published M.A. Thesis, University of Ten- 
nessee, 1951. 

11. Wallace, W. L. The prediction of grades in spe- 
cific college courses. J. educ. Res., 1951, 44, 
587-597. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


Academic Achievement in Engineering Related to Selection 
Procedures and Interests 


Louis Long and James D. Perry 
Division of Testing and Guidance, The City College of New York 


Over the past ten years the matter of im- 
proving the selection of engineering students 
has been under investigation at the City Col- 
lege. A variety of tests has been used in 
conjunction with the high school average in 
determining which students should be ad- 
mitted. The ACE Psychological Examina- 
tion, the Cooperative General Achievement 
tests, and the Pre-Engineering Inventory have 
been used as well as some special tests con- 
structed for the college by Kenneth W. 
Vaughn. Tests have been added and elimi- 
nated on the basis of studies relating test 
scores to academic grades. At this point the 
program has become fairly stable and conse- 
quently it was felt that a report based on the 
present battery might be of interest to other 
colleges. 

Specifically this study was designed with 
two purposes in mind: to evaluate the effec- 
tiveness of high school averages and scores 
on entrance examinations as a basis for pre- 
dicting four-year grade-point average in an 
engineering college, and to study the relation- 
ship between the four-year college grade-point 
average and ratings on two standard interest 
questionnaires (the Strong Vocational Interest
Blank and the Kuder Preference Record).
It was also hoped that ratings on the Kuder 
obtained from students during their freshman 
year might be compared with those obtained 
during their senior year. The number of students
taking the questionnaire on both occasions
was small, but since there are not much
data of this type available we shall, nevertheless,
present them.

The results reported are based on students
graduating from the School of Technology of
the City College during the calendar year of
1951. Of the 521 graduates, 433 were included
in one or more phases of this study.
Since all available data were used in each part
of the study the number of cases varies from
one part of the study to another.




A four-year college average (weighted ac- 
cording to credits and grades), calculated by 
the School of Technology,¹ was available for
each graduate. 


Effectiveness of Selection Techniques 


Studies based on data from previous entering
classes have indicated that first term
grades at City College can be most effectively 
predicted by using a Composite Score based 
upon high school average (weight of five), 
and scores on the following tests: Scientific 
Verbal Ability (weight of one), Comprehen- 
sion of Scientific Materials (weight of two); 
and General Mathematical Ability (weight 
of two).²

The intercorrelations between the four-year
college average, high school average, and
the scores on the three tests entering into the
Composite Score are presented in Table 1
along with the means and standard deviations
for these variables. The correlations between
the four-year college average and the other
variables range from 0.30 to 0.50.³ The correlation
between the Composite Score and the
four-year college average is 0.53,⁴ which is
only a slight increase over the correlation of
0.50 between the score on the test measuring
General Mathematical Ability and the four-year
college average, but a sizable increase
over the correlation of 0.40 between high
school average and the four-year college average.⁵
Adding the other tests given as part of
the Entrance Examination (see footnote 2) to
the Composite Score does not bring about a
significant increase in this correlation of 0.53
between the Composite Score and the four-year
college average.


Table 1


Intercorrelations, Means, S.D.s: Four-Year College Average, High School Average, and
Three of the Entrance Examinations (N = 182)


                                          Variables
                                   1       2       3       4       5
1. Four-year college average       -
2. High school average            .40      -
3. General Math. Ability          .50     .34      -
4. Comp. Sci. Materials           .41     .28     .61      -
5. Sci. Verbal Ability            .30     .29     .41     .58      -
Mean                              80.6    83.5    51.2    56.4    57.5
S.D.                              4.8     3.9     14.1    11.1    12.6


Interests and Grade-Point Average


Strong Vocational Interest Blank. The
Strong was administered to 158 of the students
as seniors (12 in Chemical, 38 in Civil,
[illegible] in Electrical, and 41 in Mechanical
Engineering). The mean standard score and the
letter grade equivalents on the scale for the
engineers and on the group scales are presented
in Table 2. The composite profile for
these students shows a B+ on the scale for
the engineers and an A on the Group II scale
(chemist, engineer, mathematician, and physicist).
The correlations between the various
scales of the Strong and the four-year college
averages are low and not significant (see
Table 2). These correlations are, of course,
lowered to some extent by the fact that the
academically weaker students have dropped
out of engineering, as have many of those students
with little or no interest in engineering.

1 The authors would like to take this opportunity
to thank Professor John R. White for making this
material available to us.

2 These weights were determined by means of regression
equations in which the effectiveness of these
three tests as well as the following were determined:
ACE Psychological Examination, General Verbal
Ability, Social Science [illegible], Verbal Ability, and
Spatial Visualizing Ability. All of the tests, except
the ACE, are part of the Inventory of Scholastic
[illegible] and were developed by Kenneth W. Vaughn.
Many of the tests are similar to those included
originally in the original Pre-Engineering Inventory
([illegible]). For a detailed description of the entrance
examination program at the City College the reader
is referred to an article by Long and Perry (5).

3 It may be of interest to the reader to compare
these correlations with those reported in the literature
for other colleges [2, 6; see summaries by
Kandel (4), Moore (8), Stuit (11)].

4 The range of the test scores has been reduced
about [illegible] per cent due to the elimination of
students during the four-year period. The effectiveness
of the tests is thereby reduced. In previous studies
correlations ranging from 0.55 to 0.70 have been found
between the Composite Score and first term grades.

5 It should be mentioned that about three quarters
of the students were admitted on the basis of high
school average alone, while a quarter were admitted
on the basis of the Composite Score. This means
that the test in mathematics was part of the selective
procedure in only a small part of the sample,
whereas the high school average was used in all
instances (either alone or in combination with test
scores). This situation explains to some extent why
the correlation between the mathematics test and
college grades is higher than that between high
school averages and college grades.

It was interesting to note in analyzing the 
interest pattern of the engineering students 
that 82 per cent of them obtained A or B+ 
ratings on the Group II scale, whereas Strong 
reports 77.5 per cent of his criterion group for 
Group II obtained A or B+ ratings. 

Kuder Preference Record. The Kuder 
was given to 172 of the graduating seniors 


Table 2


Correlations between Scores on the Strong Vocational
Interest Blank (Form M) and the Four-Year
College Average (N = 158)


                                      Letter Grade
Scales of the Strong          Mean*   Equivalent    S.D.    r
Individual Scale
  Engineers                   42.6    B+             9.0    [?]
Group Scales
  Group I (Human Science)     41.1    B+            15.9    .03
  Group II (Technical)        46.9    A-            13.2    .07
  Group V (Personnel)         36.8    B              9.3    .09
  Group VIII (Office)         30.4    B-             9.6    [?]
  Group IX (Sales)            30.4    B-             [?]    [?]
  Group X (Verbal)            33.8    B-             8.9    .13


* Standard scores.


Table 3


Correlations between Scores on the Kuder Preference
Record (Form BM) and the Four-Year
College Average (N = 172)


                             Percentile
Scales of Kuder      Mean    Equivalent†   S.D.    r
Mechanical           91.2    65            13.3    .16*
Computational        40.2    66             9.0    [?]
Scientific           76.9    80            10.6    [?]
Persuasive           64.5    35            13.5   -.10
Artistic             50.5    65            12.5   -.18*
Literary             49.8    60            14.0    .21**
Musical              18.2    61             8.6    .08
Social Service       64.8    30            16.0   -.09
Clerical             42.3    21            10.9    .04


* Significant at the 5 per cent level.
** Significant at the 1 per cent level.
† Based on male adults (1946 profile sheet).


(13 in Chemical, 50 in Civil, 79 in Electrical, 
and 30 in Mechanical Engineering). The 
two most relevant scales on the Kuder would 
be mechanical and scientific. The mean 
scores on these two scales were 91.2 and 76.9 
respectively (Table 3). The equivalent per- 
centile ratings would be the 65th and the 
80th, using the male adult norms presented 
in the 1946 edition of the Kuder profile sheet. 
The correlations between the various scales
of the Kuder and the four-year college average
are all low (Table 3), but a few of them
are significant at the five per cent level and


Louis Long and James D. Perry 


one at the one per cent level.° These corre- 
lations are, of course, lowered by the same 
factors mentioned in connection with the cor- 
relations between scores on the Strong and 
grades. 
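
The significance levels attached to the Kuder correlations can be checked roughly with a short sketch. It assumes a two-tailed test based on the Fisher z approximation, which is not necessarily the test the authors used; the function name `significance_of_r` is hypothetical, and only the r values that are legible in Table 3 are included.

```python
import math


def significance_of_r(r, n):
    """Classify a correlation as significant at 1%, 5%, or not (two-tailed).

    Assumption: atanh(r) * sqrt(n - 3) is treated as a standard normal
    deviate (Fisher z approximation). The article does not state its test.
    """
    z = abs(math.atanh(r)) * math.sqrt(n - 3)
    if z >= 2.576:  # two-tailed 1 per cent point of the normal curve
        return "1%"
    if z >= 1.960:  # two-tailed 5 per cent point of the normal curve
        return "5%"
    return "n.s."


# Legible correlations from Table 3 (N = 172)
kuder_r = {
    "Mechanical": 0.16,
    "Persuasive": -0.10,
    "Artistic": -0.18,
    "Literary": 0.21,
    "Musical": 0.08,
    "Social Service": -0.09,
    "Clerical": 0.04,
}

for scale, r in kuder_r.items():
    print(f"{scale:15s} r = {r:+.2f}  {significance_of_r(r, 172)}")
```

Under this approximation the Literary scale alone reaches the one per cent level, and the Mechanical and Artistic scales reach the five per cent level, in line with the asterisks in Table 3.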


Kuder Freshman-Senior Correlations 


Thirty-two students who took the Kuder 
during their senior year also took it during 
their freshman year. In comparing the mean 
scores (Table 4) the only difference that is 
statistically significant is that for the scientific 
scale (mean of 81.8 as freshmen; mean of 73.5 
as seniors). 

The correlations between the two sets of
scores vary considerably (Table 4) from one
scale to another (-.22 to +.66). These correlations
should be thought of not only as an
index of reliability but also as an index of
stability of interests over a four-year period.

Finding only a limited relationship between
the ratings on the interest questionnaires and
academic grades is what one would expect on
the basis of other studies (1, 3, 7, 9; for
summaries see 11 and 12). Of course, in a
counseling situation the interest questionnaires
are used with the idea of obtaining information
about the interest pattern, not with

°It is interesting to note that the only scale with 
an r significant at the one per cent level is the 
Literary Scale, which is the same one Yum (14) 
found to be significantly related to grades made by 


the men in his study. 

Table 4

Correlations between Scores on Kuder Obtained during Freshman and Senior Years (N = 32)

                                     Percentile
                   Mean Score        Equivalent*       S.D.
Scales of Kuder    Fresh.   Sr.      Fresh.   Sr.      Fresh.   Sr.      r
Mechanical         85.8     90.3     [?]      [?]      [?]      [?]      [?]
Computational      34.7     36.8     [?]      [?]      [?]      [?]      [?]
Scientific         81.8     73.5     [?]      [?]      [?]      [?]      .31
Persuasive         65.3     67.6     37       41       [?]      12.4     .27
Artistic           55.0     53.8     [?]      [?]      [?]      [?]      .66
Literary           43.5     46.9     42       51       13.8     14.2     .35
Musical            19.8     22.0     70       75       13.6     11.9     .49
Social Service     67.1     62.0     35       [?]      [?]      [?]      .37
Clerical           [?]      37.8     [?]      [?]      [?]      [?]     -.22

* Based on male adults (1946 profile sheet).




Academic Achievement in Engineering 


the idea of getting information that will help
to predict academic grades.7


Summary and Conclusion 


Using a weighted grade-point average based
on four years of college work as a criterion,
the results of this study indicate that the
selection of freshman engineering students
can be improved by the use of both high
school averages and test scores. The effectiveness
of the following tests was investigated:
Scientific Verbal Ability, Comprehension
of Scientific Materials, and General
Mathematical Ability.

The correlations found between two interest
questionnaires (Strong and Kuder) and
college grades are not high enough to warrant
the inclusion of ratings on such questionnaires
in a selection battery, but it is felt that
such instruments are useful in an individual
counseling situation.


Received January 15, 1953. 


References 


1. Berdie, R. F. The prediction of college achievement
and satisfaction. J. appl. Psychol., 1944,
28, 239-245.

2. Crawford, A. B. and Burnham, P. S. Forecasting
college achievement; Part 1: General considerations
in the measurement of academic
promise. New Haven, Conn.: Yale Univ.
Press, 1946.

3. Holcomb, G. W. and Laslett, H. R. A prognostic
study of engineering aptitude. J. appl.
Psychol., 1932, 16, 107-115.

4. Kandel, I. L. Professional aptitude tests in medicine,
law, and engineering. New York: Teachers
College, 1940.

5. Long, L. and Perry, J. D. Entrance examinations
at the City College of New York.
Educ. psychol. Measmt., 1947, 7, 765-772.

6. Lord, F., Cowles, J. T., and Cynamon, M. The
Pre-Engineering Inventory as a predictor of
success in engineering colleges. J. appl. Psychol.,
1950, 34, 30-39.

7. Melville, S. D. and Frederiksen, N. Achievement
of freshmen engineering students and
the Strong Vocational Interest Blank. J.
appl. Psychol., 1952, 36, 169-173.

8. Moore, J. E. A decade of attempts to predict
scholastic success in engineering schools.
Occupations, 1949, 28, 92-96.

9. Phillips, W. S. and Osborne, R. T. A note on
the relationship of the Kuder Preference Record
Scales to college marks, scholastic aptitude
and other variables. Educ. psychol.
Measmt., 1949, 9, 331-337.

10. Strong, E. K., Jr. Vocational interests of men
and women. Stanford, Calif.: Stanford Univ.
Press, 1943.

11. Stuit, D. B., et al. Predicting success in professional
schools. Wash., D. C.: Amer. Council
on Educ., 1949.

12. Super, D. E. Appraising vocational fitness.
N. Y.: Harper, 1949.

13. Vaughn, K. W. The Pre-Engineering Inventory.
J. engng. Educ., 1944, 34, 615-625.

14. Yum, K. S. Student preferences in divisional
studies and their preferential activities. J.
Psychol., 1942, 13, 193-200.


7 See discussion by Strong (10, pp. 17-19).


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


Study of Values Profiles Adjusted for Sex and Variability 
Differences 


Julian C. Stanley 


Department of Education, University of Wisconsin 


In 1951 the 20-year-old Allport-Vernon 
Study of Values, a scale for measuring evalua- 
tive attitudes, was revised by Allport, Vernon, 
and Lindzey (1). Especially, its social scale 
was altered considerably in an attempt to 
secure greater homogeneity. 

As originally, the average score of the 
standardization group on each of the six 
values has been equalized, now approximating 
40 for men and women combined. Marked 
systematic sex differences with respect to both 
means and standard deviations remain, how- 
ever. The women are more religious, aesthe- 
tic, and social; the men are more theoretical, 
political, and economic. The range of the 12 
means listed in the Manual of Directions (1) 
is nearly seven raw-score units, while vari- 
ances go from 37 to 111. 

Because each testee has exactly 240 points 
to allot, there is no individual profile level, 
since the mean of his six value scores must 
be 40. The revised booklet supplies only one 
norm profile, employing a mean of 40 for any 
value for either sex. 

If a profile is to be used at all, it seems 


desirable to remove the group level factors— 
mean and standard deviation—separately for 
each sex. This has been done in Table 1 for 
the 1,816 college students who make up the 
general norms. Note there, for example, that 
a raw theoretical score of 53 has the same 
standard-score meaning for men as a theoretical
score of 46 has for women. Furthermore,
43 on theoretical is for men equivalent
to 37 on aesthetic.
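
The idea of removing group mean and standard deviation can be sketched as a linear (z-score) conversion between two norm groups. This is only an illustration: the function name `equivalent_raw` and the norm values below are hypothetical, chosen so that the 53-to-46 example works out exactly, and they are not the figures from the Manual of Directions. The published table is a centile sheet, so its entries need not follow a strictly linear conversion.

```python
def equivalent_raw(score, from_norm, to_norm):
    """Convert a raw score to the raw score with the same standard score
    in another norm group (linear z-score equating)."""
    z = (score - from_norm["mean"]) / from_norm["sd"]
    return to_norm["mean"] + z * to_norm["sd"]


# Hypothetical norms for the theoretical scale (NOT the Manual's values)
men_theoretical = {"mean": 43.0, "sd": 8.0}
women_theoretical = {"mean": 36.0, "sd": 8.0}

# A man's raw score of 53 and a woman's raw score of 46 then share z = 1.25
print(equivalent_raw(53, men_theoretical, women_theoretical))  # 46.0
```

The same function can compare two values within one sex by passing the norms of the two scales instead of the norms of the two sexes.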

Two cautions are appropriate here. First,
Table 1 is based upon national norms and
may therefore be somewhat imprecise in certain
local situations. For example, when
scanning profiles we should remember that
both women and men in the Southeast may
tend to score higher on the religious scale
than do those in other sections (2, 3). Second,
as the Study of Values authors warn (1), a
“high” score is high in an interindividual
sense only if comparisons are made among
persons who can reasonably be expected to
have the same average value level. Therefore,
interpretations should usually be confined
to the relative prominence of
individual values, since the inter-individual
meaning of either raw or standardized level-
free scores will not be clear when heterogeneous
groups are involved.


Table 1*

Centile Sheet for College Men and Women on the Allport-Vernon-Lindzey Study of Values

Note: Based upon the Norms (851 Men, 965 Women) …

          Theoretical   Economic    Aesthetic   Social      Political   Religious
Centile   M     W       M     W     M     W     M     W     M     W     M     W
90        53    46      54    48    50    [?]   [?]   [?]   [?]   [?]   [?]   [?]
25        38    31      [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]
10        34    27      [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]   [?]

[Remaining rows and cells are illegible in the source.]

* This is an abbreviated table. To reduce …, the original table has been
deposited with the American Documentation Institute. Order Document … from
ADI Auxiliary Publications Service, Library of Congress, Washington …,
remitting $1.25 for microfilm (images 1 inch high on standard 35 mm. motion
picture film) or $1.25 for photocopies (6 × 8 inches) readable without
optical aid.


Table 1 is merely a statistical attempt to 
rid the scale of certain inequalities that seem 
to make intra-individual comparisons less 
precise. 


Received January 26, 1953. 


References


1. Allport, G. W., Vernon, P. E., and Lindzey, G.
Study of Values: a scale for measuring the
dominant interests in personality. Booklet and
manual of directions. Boston: Houghton Mifflin,
1951.

2. Gray, Susan W. A note on the values of southern
college women, white and Negro. J. soc.
Psychol., 1947, 25, 239-241.

3. Stanley, J. C. and Gray, Susan W. Sex differences
and self-insight on Spranger’s value
types. Mimeographed, 1951.

THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


A Scale for Measuring Work Attitude for the MMPI 


Mary Tydlaska 


Columbia-Southern Chemical Corporation, Lake Charles, Louisiana 


and Robert Mengel 


Lake Charles, Louisiana Air Force Base 


The Minnesota Multiphasic Personality In- 
ventory (3) is one of the most recent and 
among the best of personality inventories. 
It is designed to measure many aspects of 
personality by scoring various combinations 
of items. Although this scale has found great 
use in its present form in clinics and hos- 
pitals, it has not been extensively used by in- 
dustry in pre-employment testing. For this 
latter purpose it was thought desirable to 
determine if there were items which could 
distinguish between individuals whose per- 
sonality organization expresses desirable atti- 
tudes toward work and good motivation to- 
ward it and individuals whose work attitude 
is notoriously poor. 


Selection of Subjects 


Two groups of subjects were studied. Two 
examples of work attitude were available. 
The 50 subjects from the Columbia-South- 
ern Chemical Corporation in Lake Charles, 
Louisiana are current employees who were 
given the MMPI in a program of pre-employ- 
ment testing which preceded their employ- 
ment. Only those employees who had com- 
pleted two or more years of satisfactory work 
performance were included in this study. 
Satisfactory work performance was based on 
merit ratings given semi-annually and a mean 
score of 3 (defined as ‘satisfactory’) was used 
as the criterion. 

The 60 air force ‘poor work attitude’ sub- 
jects whose MMPI records were used in this 
study were male white air force service per- 
sonnel in the 806th Supply Squadron at the 
Lake Charles, Louisiana Air Force Base. The 
category ‘poor work attitude’ represents 43 
A. W. O. L. cases, 7 disciplinary problems, 8 
individuals suspected of malingering, and 2 
miscellaneous cases.

The senior writer served, during the summer
of 1952, as a consultant in administering
and interpreting a battery of tests designed to
aid the commanding officer in working with
these and similar individuals. An evaluation
of ‘poor work attitude’ for each of these 60
cases was made on the basis of one or more
interviews and test data, including a sentence
completion test and the MMPI.

The groups of air base ‘poor work attitude’
cases and ‘satisfactory work attitude’ employees
were matched for certain items of
biographical data. These variables include 
intelligence (an Otis IQ for the industrial 
employees and an Airman’s Qualifying Exam 
score for the air base personnel), age, educa- 
tion, general occupational level, and marital 
status. The typical subject was about 27 
years of age, had average intelligence, had 
completed the eleventh grade of school, and 
was more likely to be married than single. 


Purpose 


The original purpose of this study was to
utilize the 60 ‘poor work attitude’ air base
personnel MMPI scores as criteria to evaluate a
number of MMPI items previously selected on
an a priori basis by seven individuals in the
field of personnel selection and testing, as
representing information indicative of an
applicant’s work attitude. This group of
experts was composed of three individuals
teaching courses in industrial psychology and
associated with a college or university. The
remaining four judges were personnel or
employment managers with 15 years mean
experience in personnel selection.

These items were selected in the following
way. Each judge was asked to indicate on an
MMPI group score sheet those statements
and their deviant response which would give
him insight into the general motivational
pattern and work attitude of an applicant for
employment. All items which were selected
by as many as three out of seven judges were
included in the experimental form of the
Work Attitude Scale.
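
The selection rule (keep an item when at least three of the seven judges marked it) can be sketched as a simple tally. The judges' picks below are hypothetical item numbers made up for illustration, and `select_items` is an assumed helper name, not something taken from the study.

```python
from collections import Counter


def select_items(judge_picks, min_judges=3):
    """Return the items chosen by at least `min_judges` judges.

    Each judge's picks are de-duplicated first, so a judge counts
    at most once toward any item.
    """
    counts = Counter(item for picks in judge_picks for item in set(picks))
    return sorted(item for item, n in counts.items() if n >= min_judges)


# Hypothetical picks by seven judges (MMPI booklet numbers are invented)
judge_picks = [
    [8, 13, 41, 97],
    [8, 41, 102],
    [13, 41, 163],
    [8, 97, 163],
    [41, 102, 204],
    [13, 204],
    [97, 250],
]

print(select_items(judge_picks))  # [8, 13, 41, 97]
```

Items marked by only one or two judges (102, 163, 204 and 250 in this invented example) drop out, which is how the article's screening removed items of limited interest to the judges.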

This preliminary screening of MMPI items
was undertaken to eliminate a number of
anticipated items such as those which were
found most valid in screening A. W. O. L.
recidivists, “excessive use of alcohol, misbehavior
in school, trouble with the law . . .”
(1, p. 231). The writer was aware of the
possibility that such items might be found
to discriminate but postulated that they would
not contribute the specific type of information
which would be most valuable in helping
an employment manager gain insight into a
potential employee's subsequent work attitude.
Most items of this nature were not
selected by three or more judges and, thus,
were not included in the experimental Work
Attitude Scale. A total of 58 items composed
the experimental scale.

A further consideration for eliminating
items not selected by three or more judges
was the desire to establish a number of items
which would be of practical value and specific
interest to employment managers in pre-employment
testing. Deviant responses to these
selected items can be read individually, and
they can then be evaluated subjectively as
well as quantitatively scored.

A technique designed to contribute more
meaningful information from a normal MMPI
profile would have great utility in aiding a
non-clinically oriented personnel staff member
to evaluate applicants from the standpoint
of their potential adjustment to an employment
environment. The over-all design
of this study was to provide for pre-employment
testing an MMPI Work Attitude Scale
tailor-made for that specific purpose.


Procedure 


An examination of the sub-scales in terms 
of their relationship to the two groups of in- 
dividuals included an inspectional analysis of 
the MMPI profiles. This inspection was con- 
ducted in order to determine the number of 
profiles classified as normal and the number 
having T scores at 70 on one or more sub- 
scales. Table 1 presents the results of this 
inspectional analysis. A comparison was also 
made of the previously selected individual 
MMPI items and both groups were scored 
for these individual items. 


Results 


Significant differences were found between
the profiles of the ‘poor work attitude’ individuals
and the ‘satisfactory work attitude’
employees. For example, 43 (or 71.7 per
cent) of the 60 ‘poor work attitude’ cases had
one or more T scores of 70 or more, while only
9 (or 18 per cent) of the 50 ‘satisfactory work
attitude’ employees had one or more scores
of 70 or more.
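
The contrast between the two groups can be checked with a chi-square test on the 2 × 2 table of profile counts. The sketch below uses the uncorrected formula (no Yates correction), which is an assumption; the article does not say how the profile difference was tested, and `chi_square_2x2` is a hypothetical helper.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2 x 2 table [[a, b], [c, d]] (1 d.f.)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: poor vs. satisfactory work attitude
# Columns: one or more T scores of 70 or over vs. none
chi2 = chi_square_2x2(43, 17, 9, 41)
print(round(chi2, 2))  # 31.51

# 6.635 is the 1 per cent point of chi-square with 1 degree of freedom
print(chi2 > 6.635)    # True
```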


Table 1


Inspectional Analysis of MMPI Profiles of ‘Satisfactory Work Attitude’
Employees and ‘Poor Work Attitude’ Air Base Service Personnel


                             ‘Satisfactory Work Attitude’   ‘Poor Work Attitude’ Air Base
                             Employees                      Personnel
                                          Cumulative                     Cumulative
                             N     %      N     %           N     %      N     %
No Sub Scales 70 or over     41    82     41    82          17    28.4   17    28.4
1 Sub Scale 70 or over       8     16     49    98          15    25     32    53.4
2 Sub Scales 70 or over      1     2      50    100         6     10     38    63.4
3 Sub Scales 70 or over      -     -      -     -           8     13.3   46    76.7
4 Sub Scales 70 or over      -     -      -     -           10    16.7   56    93.3
5 Sub Scales 70 or over      -     -      -     -           4     6.6    60    100
Total                        50    100                      60    100


Table 3

Scores on a Tentative Work Attitude Scale

                  Frequency of         Frequency of
Tentative         Satisfactory         Poor Work
Work              Work Attitude        Attitude Air Base
Attitude          Employees            Personnel
Scale             (N = 50)             (N = 60)
25-29             0                    8
20-24             1                    15
15-19             2                    8
10-14             7                    12
 5- 9             25                   1
 0- 4             14                   0
Mean              7.0                  16.4
S.D.              4.1                  7.4

Only 37 items, from the 58 previously se- 
lected by three or more judges, were found to 
distinguish between ‘poor work attitude’ in- 
dividuals and ‘satisfactory work attitude’ em- 
ployees at the .01 level of confidence. The 
previously selected items in the experimental 
scale with the highest chi-square values were 
then combined and are presented in Table 2¹
as a Work Attitude Scale. The items are arranged
in the following order: rank order in
differentiating ability, the MMPI booklet 
number of the item, the deviant response, the 
MMPI item, the number of each group giv- 
ing the deviant response, and the chi-square 
value attached to the deviant response. 

The MMPI’s of the two groups were rescored
in order to obtain the score each individual
in the two groups made on this Work
Attitude Scale. The distributions for these
groups are presented in Table 3. A comparison
of the scores was made. The number of
responses on the Work Attitude Scale for the
‘poor work attitude’ cases in the validation
group ranged from 5 to 29 (Mean 16.4, S.D.
7.4) while scores for the ‘satisfactory work
attitude’ employees ranged from 3 to 20
(Mean 7.0, S.D. 4.1).


1 To save printing costs, a 3-page table listing the
37 items in the Work Attitude Scale …




A cut-off score was established where the
number of mis-identifications reached a minimum.
Using a cut-off score of 13, 15 per cent
of the ‘poor work attitude’ cases and 12 per
cent of the ‘satisfactory work attitude’ employee
group were incorrectly identified.
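
A cut-off chosen to minimize mis-identifications can be found by trying every candidate score and counting errors in both groups. The sketch below is illustrative only: the individual scores are hypothetical (the article reports only grouped frequencies), `best_cutoff` is an assumed helper name, and the convention that a score at or above the cut-off is flagged as ‘poor work attitude’ is an assumption, since the article does not say on which side a score of exactly 13 falls.

```python
def best_cutoff(poor_scores, satisfactory_scores):
    """Return (cutoff, errors) minimizing total mis-identifications.

    Assumed convention: a score >= cutoff is flagged as 'poor work attitude'.
    Ties are resolved in favor of the lowest cutoff.
    """
    all_scores = list(poor_scores) + list(satisfactory_scores)
    best = None
    for cutoff in range(min(all_scores), max(all_scores) + 2):
        missed_poor = sum(s < cutoff for s in poor_scores)
        flagged_satisfactory = sum(s >= cutoff for s in satisfactory_scores)
        errors = missed_poor + flagged_satisfactory
        if best is None or errors < best[1]:
            best = (cutoff, errors)
    return best


# Hypothetical individual scale scores (not the study's data)
poor = [5, 9, 14, 16, 18, 22, 25, 29]
satisfactory = [3, 4, 6, 7, 8, 10, 12, 20]

cutoff, errors = best_cutoff(poor, satisfactory)
print(cutoff, errors)  # 13 3
```

Weighting the two kinds of error differently (for example, treating a missed poor risk as costlier than a rejected good applicant) would only require replacing the simple sum with a weighted one.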

Admittedly, the items in Table 2 comprise
a tentative scale which requires further validation.
Until comparative studies have been
carried out, the writer wishes to emphasize
the experimental nature of this scale. There
is a possibility that work attitude may not be
a general factor but rather may be highly
specific to particular work situations. Some
attrition of items could then be expected in
cross validation.

The writer plans to subject these items to
further validation by studying the MMPI
scores of a group of men whose work performance
and work attitude at Columbia-Southern
have been consistently merit rated as ‘more
than satisfactory’ and a group of men terminated
because they were dissatisfied with
either their assigned work or their working
conditions. Further study with a freshman college
population is also planned.

An interesting generalization, however, can
be made from the writer’s experience in
individually re-reading and scoring each deviant
response. An unusually large proportion of
‘poor work attitude’ individuals expressed concern
over their bodily functions and belief
that they were not in good health. As this
was almost a chronic complaint, it suggests
that a relationship exists between the
Hypochondriasis Scale and the proposed Work
Attitude Scale.

The writer believes, however, that further
validation of this scale would prove definitely
advantageous for the purpose of screening out
those individuals whose Work Attitude score
suggests that they are poor risks for employment.
The problem of probable risk is an
important one in an employment situation.
Some of the resulting consequences of a poor
work attitude are: (a) loss of productive
time; (b) loss of time and effort expended in
training a poor worker; and (c) negative
influence of a low morale worker on fellow
workers.




If an applicant is hired for permanent employment,
it should be with the knowledge
that his work attitude will enable him to contribute
positively to the demands of the work
situation and environmental needs of his co-
workers. It may be that such a short scale
could have wide use in screening applicants in
the pre-employment situation.


Summary 


From 58 MMPI items originally selected
by three or more judges working in the area
of personnel selection and testing as representing
insight into a potential employee's
inner motivation and work attitude, 37 items
were found to distinguish at the .01 level of
confidence between a group of 60 male white
‘poor work attitude’ air force personnel and
a group of 50 ‘satisfactory work attitude’ industrial
employees equated in terms of education,
sex, intelligence, age, occupation, and
marital status.

When the 37 items which distinguish at the
highest level of confidence are combined into
a scale with unit weights and using 13 as a
critical score, a Work Attitude Scale is obtained
which correctly identified about 85
per cent of ‘poor work attitude’ cases and 88
per cent of ‘satisfactory work attitude’ employees.


Received January 26, 1953. 


References 


1. Clark, J. H. Application of the MMPI in dif- 
ferentiating A.W.O.L. recidivists from non- 
recidivists. J. Psychol., 1948, 26, 229-234. 

2. Gough, H. G., McClosky, H., and Meehl, P. E. 
A personality scale for dominance. J. ab- 
norm. and soc. Psychol., 1951, 47, 360-367. 

3. Hathaway, S. R. and Meehl, P. E. An atlas for 
the clinical use of the MMPI. Minneapolis: 
Univ. Minn. Press, 1951. 

4. Wiener, D. N. Subtle and obvious keys for the 
MMPI. J. consult. Psychol., 1948, 12, 164- 
170. 


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


The Effects of Experience and Change of Job Interest on the
Kuder Preference Record ¹


Frederick Herzberg 


Psychological Service of Pittsburgh 


and Diana Russell ²


University of Pittsburgh 


Valid occupational interest patterns provide
one of the major bases of vocational
counseling. The usefulness of such profiles
depends largely upon the nature of the sample
used for their construction. Two factors,
success and satisfaction of employees in a
field, have been shown to affect the form of
such profiles.

A study by Hahn and Williams (4) with 
Marine Corps women revealed that certain of 
the Kuder scales distinguished satisfied from 
dissatisfied clerical workers. DiMichael and 
Dabelstein (2) added to the findings by re- 
cording significant relationships between the 
degree of satisfaction with their employment 
expressed by vocational rehabilitation coun- 
selors and their interests on the Kuder Prefer- 
ence Record. In a report by Barnette (1) 
occupationally successful and unsuccessful 
counseled veterans were distinguished on the 
basis of their Kuder measured interests.

Two other possible influencing factors 
which may alter the values of occupational 
interest profiles are the experience on the job 
and the lack of major interest in other fields 
of individuals in the sample on which a par- 
ticular vocational pattern is based. These 
factors have been little considered in the de- 
velopment of occupational Kuder preference 
norms. Slight evidence exists as to the similarity
of interest patterns between experienced
and inexperienced workers in various
occupations. If the profiles of these two
groups differ, the practice of utilizing experienced
persons as the basis for vocational
counseling is seriously handicapped.


1 This research was supported by a grant from the
Buhl Foundation.
2 Miss Russell is now on the staff of the Department
of Child Study and Research, School District
of the City of Erie, Pa.

The desire to remain in the same occupation
may appear to be similar to satisfaction;
however, many people who do not express
discontent for their present occupation may
nevertheless show a preference for employment
in some other area. How much of an
effect such an expression has on occupational
norms has not been determined.

It was the purpose of this study to examine
the similarities and differences on Form BI
of the Kuder Preference Record: (a) between
individuals experienced in various occupations
and persons entering these same occupations,
and (b) between individuals expressing an interest
in an occupational area other than that
in which they are experienced and those persons
in the same field who profess no other
occupational interest.


Method

Psychological Service of Pittsburgh, in different
phases of its services, has accumulated
scores on the industrial form of the Kuder
Preference Record for various occupations.
All subjects were male adults whose interests
and abilities have been measured as part of
a larger testing program to select persons for
promotion or employment.

At the time of an initial interview, the subjects
were asked to indicate the nature of the
work in which they were presently engaged
as well as the positions for which they were
applying. The members of each occupational
group were then classified as: (a) entering
the vocation for the first time (entry group);
(b) having had previous experience in the
area in which they were seeking employment
(experienced group); or (c) seeking employment
in a field other than the one in which
they were experienced (other interest group).
Subjects representing five occupations were


chosen for sampling: engineers, salesmen, production
managers, laboratory workers, and
laborers. The samples used in this study
with the various breakdowns are shown in
Table 1.


Table 1

Breakdown of Sample Studied

                                                Occupation
                          Engineering   Sales        Laboratory   Managerial    Laboring
                          DOT 0X74      DOT 1X55     DOT 0X70     DOT 0X84      DOT 6X669
Entry (Inexperienced)     131           36           12           -             -
Experienced               123           82           29           27            49
Experienced with other
  Job Interest            12            28           -            20            39
                          (sales)       (various)*                (various)**   (machinist)

* Journalism 1, Business Relations 4, Engineering 5, Office Management 1, Routine Recording 1, Structural
Crafts 8, Laboring Jobs 8.
** Sales 9, Engineering 4, Office Management 3, Drafting 1, Mechanical Repair 1, Machinist 2.


Means and standard deviations of each interest
scale were computed for all sub-groups
in each occupational area. The significance …
A difference was accepted as significant
if the “t” value was beyond the 1% confidence
level.


Results

Engineers. Entry engineers are found to
have a higher mean on the Mechanical and
Scientific scales and a lower mean on the
Musical scale than experienced engineers.
Engineers seeking sales positions have significantly
higher Persuasive interests than experienced
engineers. See Table 2, A.

Salesmen. Entry and experienced salesmen
have similar interest profiles with no
significant mean differences between them.
Salesmen applying for jobs in other occupational
areas record a significant drop in Persuasive
scores. The larger standard deviation
for the “other interests” sales group on the
Mechanical scale is understandable in view
of the variety of other occupations included.
See Table 2, B.

Laboratory Workers. A comparison of
entry workers with experienced laboratory
workers reveals a significantly higher Scientific
mean for the entry group. See Table
2, C.

Production Managers. A higher Persuasive
mean is found for production managers with
other occupational desires when compared
with production managers seeking employment
in the same area. Almost one half of
the “other interest” group were sales applicants.
See Table 2, D.

Laborers. The Mechanical and Scientific
means for laborers with machinist ambitions
are significantly larger than the corresponding
means for laborers desiring to continue in
laboring jobs. The variances between the
two groups on the Mechanical scale are significantly
different but it is unlikely that the
“t” ratio is produced entirely by the differences
in variances (3). See Table 2, E.
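
The comparison of two group means can be sketched from summary statistics alone. The example uses the Mechanical scale figures for entry and experienced engineers as read from Table 2 with decimal points restored (54.1 and 48.8, an assumption about the scan), and it uses the unpooled standard error of the difference, which may differ from the formula the authors applied; `t_from_summary` is a hypothetical helper.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio for the difference between two independent means,
    using the unpooled standard error of the difference."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se


# Mechanical scale: entry engineers vs. experienced engineers
# (values as read from Table 2; decimal placement is an assumption)
t = t_from_summary(54.1, 9.3, 131, 48.8, 9.2, 123)
print(round(t, 2))  # 4.56
```

A ratio of this size is well beyond the two-tailed 1% point for samples this large (roughly 2.6), in line with the difference the article reports for the Mechanical scale.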


Discussion 


Two generalizations are suggested by the 
results presented. With respect to the effect 
of experience on Kuder occupational norms, 
the interest patterns of entry and experienced 
workers are essentially similar. The entry 
groups are often characterized by higher 
scores on those interest scales which particu- 
larly belong with their vocational fields. 
Thus, it was shown that entry engineers were 
significantly higher on the Mechanical and 
Scientific scales and entry laboratory workers 
higher on the Scientific scale than were their 
experienced counterparts. These higher scores 
for the entry group may stem from their 
recent completion of training and their pre- 


480 


Frederick Herzberg and Diana Russell 


Table 2 


A Comparison of the Means and Standard Deviations between the Competitive Groups 


Groups Mech Comp Sci Pers Art Lit Mus SSer ple 
` A. Engineers e 
Entry Engineers (M) 541 294 41.0 403 198 178 9.0 cae m 
(N = 131) (c) 9.3 79 ho QI 7.9 8.4 59 109 x 
+ ae ae 
Experienced Engineers (M) 488 30.9 381 424 204 188 11.0 414 cil 
(N = 123) (c) 92 82 86 139 79 74 66 11.6 o 
* 
2 
Engineers with Sales Interests (M) 47.7 27.3 378 566 186 239 128 348 a 
(N = 12) (o) 7.9 6.8 7.8 9.7 7.6 wa 5,5 11.3 x 
B. Salesmen 59 
Entry Salesmen (M) 36.6 216 320 61.9 18.5 21.6 16.0 42.0 a 
(N = 36) (o) 10.6 66 11.5 8.6 8.3 6.9 6.0 8.9 ‘ 
Experienced Salesmen (M) 384 22.7 30.2 620 28 190 135 41.9 a 
(N = 92) @ 130 84 96 95 75 78 67 108 1 
** 
Salesmen with other Job Interests (M) 368 238 315 532 20.7 20.7 12.7 43.4 Pe 
(N = 28) () 185 92 98 9g 70 81 66 10.9 1 
C. Laboratory Workers 4 
Entry Laboratory Workers (M) 481 272 528 423 192 160 111 41 ei 
(N = 12) @) 67 18 52 141 102 66 49 70 ” 
** 
Experienced Laboratory Workers (M) 468 28.7 461 40.3 20.0 20.2 13.7 37.2 a 
(N = 29) © 87 81 77 158 73 92 57 19 ® 
D. Production Managers 
Production Managers (M) 529 251 410 36.1 21.3 161 11.3 40.7 a 
N = 27) @) 89 108 72 99 g1 101 6g 96 ™ 
** 
Production Managers; other Job (M) 50.7 26.2 35.1 48 a) 349 
: . f O 191 18.2 11.2 425 
Interests (N = 20) (c) 85 74 83 127 73 fe 69 108 93 
E. Laborers 4 
E ie 494 264 353 372 m6 173 99 425 a 
N= o) 16 82 ? r 3 9.5 Á 
z 81 115 104 s7 70 9 
Paborera Desiting Mites (Mf STA gos ig gig Ce we o we 355 
(N = 39) (e) 62 54 96 9.2 80 62 62 9.1 a 
** The difference between adjacent means is significant at the 1% level of confidence.
occupation with the subject matter in the 
characteristic areas. In addition, it is highly
probable that the results reflect a slanting of
responses toward the desired occupational
choice. Applicants for jobs are apt to alter
or modify their responses to produce what
they feel are the desirable interest patterns.

A change of vocational goals is reflected
in the Kuder interest scores. The nature of
such change depends upon the type of work
desired. Engineers typically have
predominant interests on the Mechanical
and Scientific scales with an average Persuasive
interest. For those experienced engineers
desiring sales work, the dominant interest is
in the Persuasive area. A similar shift toward
Persuasive interest occurs for production



Experience and Change of Job Interest 


managers desiring other types of jobs. Almost
half of the “other interest” group were
sales applicants and their weight in the group
accounted for the significant change in this
scale. Contrariwise, experienced salesmen
seeking employment outside of sales work
show lower scores in Persuasive interests.
This trend is also observed when comparing
laborers with machinist ambitions with laborers
content to fill laboring jobs. In this instance
the groups are decidedly differentiated
in the expected area of Mechanical interest.
The explanation of consciously biasing responses
toward the desired area mentioned
before would apply more specifically to the
“other interest” groups. The reason for desiring
to change job areas is not known.
Dissatisfaction with their present vocation
could possibly have been the deciding factor
with many persons in the “other interest”
groups. Nevertheless, the factor of other job
desires does become an important variable
in the selection of samples for occupational
norms.


Summary 
1. This study was designed to determine
the similarity of Kuder interests between:
(a) entry and experienced workers; and (b)
experienced workers and experienced workers
with new occupational goals.

2. The interests of the entry groups are
basically similar to those of experienced persons
in the same occupation. The differences
are in the direction of higher scores for
the entries on scales typical of the occupational
area. This similarity lends validity
to the practice of using interest profiles based
on experienced workers for vocational counseling.

3. It has been shown that Kuder interest 
scores of persons seeking employment in a 
new area differ from those of persons in similar
occupations who choose to remain in their present
vocational field. The particular scale in
which the differences occur follows the type of
work to which the change is being made.
Though definite conscious slanting of test re- 
sponses occurs in a situation in which em- 
ployment is involved and the reason for many 
of the job changes may have been dissatisfac- 
tion with their present type of work, the dif- 
ferences found do suggest the importance of 
no other job interest as a criterion in the selec- 
tion of samples for determining occupational 
interest norms. 


Received January 26, 1953. 


References 


1. Barnette, W. L., Jr. Occupational aptitude patterns
of selected groups of counseled veterans.
Psychol. Mon., 1951, 65, No. 5 (Whole No.
322).

2. DiMichael, S. G. and Dabelstein, D. H. Work
satisfaction and work efficiency of vocational
rehabilitation counselors. Amer. Psychol., 1947,
2, 342-343. (Abstract)

3. Fisher, R. A. Statistical methods for research
workers. (7th Ed.) London: Oliver and
Boyd, 1938. Pp. 129-130.

4. Hahn, M. E. and Williams, Cornelia T. The
measured interests of Marine Corps Women
Reservists. J. appl. Psychol., 1945, 29, 198-211.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953


Effect of Viewing Angle and Parallax upon Accuracy of Reading 
Quantitative Scales * 


Jerome Cohen 


Antioch College 


James M. Vanderplas and William J. White 


Aero Medical Laboratory, Wright-Patterson Air Force Base, Ohio 


An important condition which affects the 
accuracy of reading instruments that present 
quantitative information is the orientation 
of the instrument with respect to the ob- 
server’s line of sight. When an instrument 
is displaced laterally from the point im- 
mediately in front of the observer, the view- 
ing angle is decreased, and if the pointer and 
the plane of the dial are also displaced paral- 
lax is introduced. It is well known that de- 
creases in viewing angle and the introduction 
of parallax affect reading accuracy. Manu- 
facturers of precision instrument scales take 
considerable pains to eliminate these factors 
on scales which are to be read to close tol- 
erances. In more common situations, how- 
ever, where several instruments are displayed 
on a flat panel, as in aircraft, it is not feasible 
to construct instruments with precise pointer- 
locating devices on them, such as mirrors, etc. 
The usual design practice in such situations 
has been to restrict the location of instru- 
ments that require great reading accuracy to 
the center of the instrument panel, thus avoid- 


ing the problem of parallax for at least some 
instruments. 


Whether this latter practice is necessary
or desirable has been the topic of discussion
by several investigators concerned with instrument
dial legibility and design practice.
Recommendations by Calvert (3) and Du
Bois (4) suggest that when instruments
must be displaced, their faces be kept
perpendicular to the line of sight. That this
kind of arrangement creates new difficulties
has been pointed out by Barr (1), who, while
agreeing that readability could be increased
by tilting the dial faces or curving the instrument
panel, mentions that space requirements
often make it impossible, lighting and reflection
problems are more difficult, and new
hazards are created (e.g., fouling during
emergency escape procedure). Kappauf (6),
however, suggests that for instrument panels
where space limitations and other factors
are not a serious consideration, panel shape
and instrument dial orientation might well be
a subject of careful study.

It appears to be generally agreed that excessively
oblique views of instrument dial
faces create serious reading errors and that
such situations should be avoided in the
design of instrument panels. But while these
conceptions, based upon experience with dial
and panel design, are sound, very little empirical
evidence or theory exists to define precisely
the limits of reading accuracy that
might be expected as a function of viewing
angle and parallax. Some data on the problem
have been accumulated by Bartlett and Mackworth
(2) in a study of errors made in locating
aircraft position when represented on a
plotting board grid. These investigators collected
extensive data on the number of mislocation
errors made when the plotting board
was seen from various viewing angles and
distances. But while these data suggest that
decreasing viewing angle beyond a critical
point (about 35 degrees) results in increased
errors, the results are not directly applicable
to an instrument reading task, due to the
nature of the apparatus used and the conditions
of their experiment.

It would appear from the Bartlett and
Mackworth data that reading accuracy can
be expected to decline systematically as a
function of decreasing viewing angle and that
serious limitations on instrument location need
to be imposed if reading accuracy is not to
suffer. It appears desirable also to determine
what changes in accuracy could be expected
in the more special case of instrument dials,
in order to discover if any generality exists
for present empirical findings or for certain
postulated invariants. It is the purpose of
this paper to report on two exploratory
studies designed to determine the changes
in reading accuracy which might be expected
to occur as a function of decreasing viewing
angle and the introduction of parallax.

* The experiments reported here were performed at
Antioch College, Yellow Springs, Ohio, under Air
Force Contract No. AF 18(600)-50.


The Experiments 
The first experiment was designed to determine
the effects of changes in viewing angle ¹
on reading errors without parallax effects
entering into the situation. Photographs of
dials were used to rule out the effects of
parallax. The photographs were presented
in a tachistoscope and read at various viewing
angles from 90 degrees to 25 degrees. A
total of 20 college students were used as
subjects in the experiment. All had normal
Snellen acuity and none had obvious visual
defects.

The apparatus consisted of a sliding mirror
tachistoscope with a switching mechanism so
that the subject could control the exposure
time. The back of the tachistoscope was
hinged so that the photographs could be
tilted either horizontally or vertically and
presented at all viewing angles between the
limits used. Ten of the subjects read the
dials tilted horizontally, and ten read them
tilted vertically. Viewing distance was 28
inches, and the brightness of the white parts
of the dials was seven foot-lamberts.

¹ By viewing angle we mean the acute angle formed
by the intersection of the plane of the dial face and
the observer's line of sight.

Fig. 1. Dial types used in Experiment I: Dial Type 1 (600 X 10) and Dial Type 2 (400 X 10).

Two kinds of dials, as shown in Figure 1, 
were used in the experiment. One was a 
600 X 10 dial, and the second was a 400 x 
10 dial. Each subject was given ten practice 
trials to familiarize him with the dials and 
the apparatus. He was instructed to read the 
dials to the nearest five units as accurately 
and quickly as possible. Each subject was 
given 40 test trials, 20 on each of the two dial 
types, four at each of ten viewing angles used. 
Five of the pointer settings for each dial type 
were presented in each of the quadrants of 
the circle. For each dial, half of the settings 
were nearest a graduation mark, and half 
were nearest a mid-mark position. On each 
trial the subject pressed a switch, opening 
the shutter and exposing the dial. When he 
had read the setting, he released the switch, 
closing the shutter, and reported the apparent 
setting to the experimenter, who recorded the 
setting report and the time. 

Results of the First Experiment. Analyses
of the data revealed no systematic change in
reading time as a function of viewing angle
within the limits studied for either dial type
or direction of slant. A fairly systematic
trend exists, however, for errors with decreasing
viewing angles. A graph of the percentage
of the total readings in error by five
or more units made at each viewing angle by
all the subjects is presented in Figure 2. The
trend for errors to increase with decreasing
viewing angles is represented by a cosecant
function. This trend might also be represented
equally well by a straight line or many
other possible functions; the reason that a
cosecant curve was used will be discussed
later. The data for both dials and both
directions of slant are combined, since there
were no real differences between the two dials
or directions.

Fig. 2. Per cent readings in error by 5 units or more at each viewing angle. Experiment I. (Abscissa: viewing angle, 90° to 20°.)
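
The proportionality between error and the cosecant of the viewing angle can be sketched numerically. The snippet below is only an illustration under stated assumptions: the paper does not say how its curve was fitted, so the no-intercept least-squares fit, the function names, and any input values supplied by a reader are hypothetical and are not the authors' data or procedure.

```python
import math


def fit_cosecant_scale(angles_deg, errors):
    """Least-squares k for the model error = k * csc(angle).

    Assumption: a one-parameter fit with no intercept. The paper does not
    state its fitting procedure.
    """
    csc = [1.0 / math.sin(math.radians(a)) for a in angles_deg]
    return sum(e * c for e, c in zip(errors, csc)) / sum(c * c for c in csc)


def predicted_error(k, angle_deg):
    """Error predicted by the cosecant model at a given viewing angle."""
    return k / math.sin(math.radians(angle_deg))
```

With such a model the predicted error doubles between 90 degrees and 30 degrees, since the cosecant of 30 degrees is 2.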

Before discussing the results of the above
described experiment, let us first turn to the
second experiment, on the effects of parallax.
The second experiment was designed to isolate,
if possible, the effects of the introduction
of parallax ² in various amounts upon errors
in pointer location. It was felt that the effects
of parallax could be studied best in a
situation which was fairly simple and in
which such errors as interpolation, numeral
identification and other factors were not present.
Apparatus was therefore designed so that the
subjects could align a pointer with a single
mark against a homogeneous background, on
the assumption that the setting errors thus
measured would reflect the errors in perceived
location.

The apparatus for this experiment consisted
of a white cardboard upon which was
painted a single black line three inches high
by one-eighth inch wide. A black pointer
of similar dimensions was set in front of the
mark on a track to permit the pointer to move
laterally to various positions in front of the
mark and background. The subjects controlled
the position of the pointer by pulling
alternately on a pair of strings. The experimenter
could read the position of the pointer
from behind the apparatus by referring to a
meter stick calibrated to the pointer position.
A white cardboard screen allowed the subject
to see only the background, black mark and
pointer.

² By parallax we mean the amount of displacement
of the pointer from the plane of the dial.

Seventeen college students acted as subjects.
They were instructed to set the pointer
on a line perpendicular to the plane of the
mark and in line with the mark. The apparatus
was set at various viewing angles between 90
degrees and 20 degrees inclusive, both left and
right. Pointer and mark were displaced by
distances of 1/4, 1/2, 1, and 1 1/2 inches. Viewing
distance was ten feet. Two settings, left
and right, were made by each subject at each
of nine viewing angles, left and right, and
each of the four displacements, a total of 136
settings per subject. Sequences of presentation
of the viewing angles, displacements, and
left and right settings were randomized for
each subject.
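
The total of 136 settings per subject can be reproduced by enumerating the conditions, on the reading that the 90 degree position has no left or right side. The sketch below is illustrative only: the eight oblique angle values are placeholders (the text gives only the range and the count of nine angles), and the function name and tuple layout are hypothetical.

```python
import random

# Placeholder oblique angles: the paper states nine angles between 90 and 20
# degrees but does not list them, so these eight values are assumptions.
OBLIQUE_ANGLES = [80, 70, 60, 50, 40, 30, 25, 20]
DISPLACEMENTS_IN = [0.25, 0.5, 1.0, 1.5]


def build_schedule(oblique_angles, displacements, seed=None):
    """Return a randomized list of (angle, side, displacement, setting)."""
    conditions = []
    for d in displacements:
        for setting in ("left", "right"):       # two settings per condition
            conditions.append((90, "front", d, setting))
            for angle in oblique_angles:
                for side in ("left", "right"):  # apparatus turned left or right
                    conditions.append((angle, side, d, setting))
    rng = random.Random(seed)
    rng.shuffle(conditions)                     # separate order per subject
    return conditions
```

Calling `build_schedule(OBLIQUE_ANGLES, DISPLACEMENTS_IN, seed=1)` yields 2 x (8 x 2 + 1) x 4 = 136 distinct settings in random order.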
Results of the Second Experiment. Preliminary
analysis of the error data revealed
that the distributions of errors at each viewing
angle were neither normal nor homogeneous
with respect to variability. The distributions
were symmetrical at the 90 degree position,
but, as viewing angle was decreased the distributions
became both more skewed and
more variable. For this reason the medians
were used rather than the means as average
measures of errors at each viewing angle and
displacement.

Figure 3 represents the data of the experiment,
and is plotted in terms of the median
constant error (in which the direction of
error is considered) as a function of viewing
angle, for the four pointer-mark displacements
used. The zero point indicates correct
alignment, positive values indicate settings
in the direction away from the observer, and
negative values indicate settings toward the
observer, relative to correct alignment.

Fig. 3. Median constant error for each pointer-mark displacement at each viewing angle. Viewing distance was 10 feet. (Ordinate: median constant error in millimeters; abscissa: viewing angle, 90° to 20°.)

Fig. 4. Median average error for each pointer-mark displacement at each viewing angle. Viewing distance was 10 feet. (Ordinate: median average error in millimeters; abscissa: viewing angle, 90° to 20°.)

It can be seen from Figure 3 that a trend 
exists here similar to that found in the first 
experiment, namely, increasing error with de- 
creasing viewing angle. In addition, there 
appears to be an inverse relation between 
constant errors and displacement, the small 
displacements yielding apparently greater 
positive constant errors, while the large dis- 
placements yield small or negative constant 
errors (in the direction of the observer). 

A similar trend appears for errors to increase
as a function of viewing angle if the
median average error (in which direction is
not considered) is plotted. Figure 4 contains
a graph of these data plotted for each
of the four displacements. The relation of
amount of displacement to error is not so
clear here as it was for constant error. The
amount of the error as indicated by the
median constant error appeared to be inversely
related to the amount of displacement.
In Figure 4, it appears that the greater the
displacement, the greater is the error variability,
as measured by the median average
error.

If all the data are combined and the median
average error is plotted as a function of viewing
angle for all displacements combined,
the data as shown in Figure 5 appear. The
curve fitted to these data is again a very close
approximation to a cosecant function: it appears
to fit the data quite well and in about
the same way as in the first experiment.

Fig. 5. Median average error for all pointer-mark displacements at each viewing angle. Viewing distance was 10 feet. (Ordinate: median average error in millimeters; abscissa: viewing angle, 90° to 20°.)
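
The two error measures used in Figures 3 to 5 differ only in whether the sign of each setting error is kept. A minimal sketch follows, assuming that "median average error" means the median of the unsigned errors; the function names are hypothetical and any values passed in by a reader are illustrative, not the experimental data.

```python
from statistics import median


def median_constant_error(errors_mm):
    """Median of signed errors: positive means set away from the observer."""
    return median(errors_mm)


def median_average_error(errors_mm):
    """Median of unsigned errors: direction of the error is disregarded."""
    return median(abs(e) for e in errors_mm)
```

A set of settings scattered evenly on both sides of correct alignment gives a constant error near zero while the average error stays large, which is the pattern the text describes for the larger displacements.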


Discussion 


There appear to be at least two effects involved
in the experimental data. The first
may be termed the effect of decreased apparent
distance between successive points on
the scale as a function of decreasing viewing
angle; the second appears to be the effect of
displacement between the pointer and scale
plane. Apparently, for the limits of this
study, large pointer displacements lead to
small constant errors but large variable errors,
while small displacements lead to large and
consistent constant errors.

These results may be interpreted, at least
in a preliminary way, by considering the
visual angle relationships which vary concomitantly
with changes in viewing angle.
A decrease in viewing angle results in a corresponding
decrease in the visual angle subtended
at the eye by a given mark separation.
It was noted earlier that in Figure 2 the data
were represented by a cosecant function.
They might have been fitted by a straight
line. However, since the projection of the
mark separation distance decreases proportionally
to the sine of the viewing angle, we
would expect errors to be inversely related to
the sine, or directly proportional to the cosecant,
of the viewing angle. At zero degrees
viewing angle, of course, the cosecant function
is infinite, and we would also expect errors
to be extremely large or erratic, since the
mark separation, as projected on the retina,
would be zero, and the dial face would probably
not be visible. The interocular distance
is purposely ignored in the geometrical
analysis.

It would be expected from this interpretation
that errors, in reading to a given criterion
of accuracy, would remain substantially negligible
until the visual angle subtended by the
criterion distance approached or diminished
below some minimum discriminable angle.
To compensate for this decreasing visual
angle, it may be necessary, in a dial reading
situation, only to increase the mark separation
so that the visual angle subtended by the
distance representing criterion error tolerance
remains above the minimum discriminable.

As an example of this kind of interpretation
we may consider a situation in which a
normal observer is required to read quantitative
scales. If the illumination is good and
it is assumed that the accuracy of the
readings does not require discriminations
finer than about one minute of visual angle,
the observer could discriminate points as
separate if they are 0.008 inch apart at a
viewing distance of 28 inches. Thus, under
conditions of the first experiment, since
[...] about 0.125 inch [...] that of which
the normal [...]. Rationally, gross increases
in error frequency under these conditions
would not be expected until the viewing angle
was decreased to a value of about 30 degrees. This
expectation is borne out reasonably well by
the data. Presumably, under levels of illumination
which are below that required for
a discrimination of one minute of visual angle,
a correction would be necessary for the decrease
in acuity at the lower level.
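
The geometry behind this interpretation can be checked with a short calculation. The sketch below assumes a simple one-eye model (the interocular distance is ignored, as in the text) and uses hypothetical function names; it reproduces the 0.008 inch figure for one minute of arc at 28 inches and shows how the projected mark separation shrinks with the sine of the viewing angle.

```python
import math

ONE_ARCMIN_RAD = math.radians(1.0 / 60.0)


def min_separation(distance_in, threshold_rad=ONE_ARCMIN_RAD):
    """Smallest separation (inches) subtending the threshold angle."""
    return distance_in * math.tan(threshold_rad)


def projected_separation(separation_in, viewing_angle_deg):
    """Separation as projected toward the eye: proportional to sin(angle)."""
    return separation_in * math.sin(math.radians(viewing_angle_deg))


def subtended_arcmin(separation_in, viewing_angle_deg, distance_in):
    """Visual angle (minutes of arc) subtended by a tilted mark separation."""
    projected = projected_separation(separation_in, viewing_angle_deg)
    return math.degrees(math.atan(projected / distance_in)) * 60.0
```

For instance, `min_separation(28)` is about 0.008 inch, and a 0.125 inch separation viewed at 30 degrees projects to about 0.0625 inch, half of its value at 90 degrees.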

It should be clear, of course, that the above 
interpretation may be applied only when 
parallax is not present (i.e., when scale marks 
and pointer are not displaced). It was seen 
from the results of the second experiment that 
different amounts of parallax lead not only 
to different constant errors (both in direction 
and amount), but to different variability as 
well. It can be seen from an examination of 
Figure 3 that there was a consistent tendency 
for the subjects to set the pointer too far 
away, to “overcompensate,” for the amount 
of parallax present and that this tendency 
became more pronounced with decreasing 
viewing angle. Tentatively, a definition has 
been constructed for parallax in terms of the 
visual angle subtended at the eye by the 
pointer-mark displacement distance, and a 
theoretical function has been derived to ac- 
count for these kinds of errors. A study is 
planned to test this theoretical function in a 
scale reading experiment. It is hoped that 
such an approach will serve a two-fold pur- 
pose of providing a basis upon which to pre- 
dict the errors to be expected in a practical 
sense, and at the same time provide a better 
understanding of the functional relationships 


involved. 
Summary 


Two experiments were performed to eval- 
uate and quantify the effects of decreased 
viewing angle and parallax upon accuracy of 
reading instrument scales. In the first ex- 
periment, viewing angles were varied, and 
subject-controlled tachistoscopic presentation 
of the stimulus dials was used. Dial photo- 
graphs were used as stimulus materials to 
isolate effects of viewing angle from those of 
parallax. The results show that reading 
errors increased as viewing angle decreased 
from 90 degrees to 25 degrees. Reading time 
was unaffected by changes in viewing angle. 

In the second experiment the effect of 




parallax was studied by requiring the sub- 
jects to align a movable pointer with a mark. 
The apparatus was set at viewing angles be- 
tween 90 degrees and 20 degrees, and four 
pointer-mark displacements were used. The 
average error of the settings increased as the 
viewing angles decreased. The increase is 
approximated by a function proportional to 
the cosecant of the viewing angle. The con- 
stant error tends to increase systematically 
with viewing angle. With increasing pointer- 
mark displacement the average error tends to 
increase, but the constant error is inversely 
related to the amount of pointer-mark dis- 
placement. 

The error curves in both experiments are 
approximated by a function proportional to 
the cosecant of the viewing angle. An inter- 
pretation in terms of a least discriminable 
visual angle is advanced to account for the 
results. 


Received January 26, 1953. 


References 


1. Barr, M. L. Visibility of cockpit instruments. J. 
Aviation Med., 1950, 21, 328-342. 

2. Bartlett, F. C., and Mackworth, N. H. Planned 
seeing; some psychological experiments. I. 
Visibility in the control of fighter command. 
II. The synthetic training of pathfinder air 
bombers in visual centering on target indi- 
cators. Air Ministry Publication No. 3139 b. 
London: His Majesty’s Stationery Office, 1950. 

3. Calvert, E. S. The scientific basis for the new
British system of cockpit lighting. Elect.
Engng., N. Y., 1944, 63, 869-870.

4. Du Bois, E. F. The anatomy and physiology of
the airplane cockpit. Aeronaut. Engng. Rev.,
1944, 4, 15-21.

5. G. B. Royal Aircraft Establishment. Layout of
cockpits with particular reference to night
visibility and lighting. G. B. Royal Aircraft
Establishment Electrical Engineering Dept.,
Memo No. 727, February, 1942.

6. Kappauf, W. C. Studies pertaining to the design
of visual displays for aircraft instruments,
computers, maps, charts, tables, and graphs:
a review of the literature. USAF, Air Materiel
Command AFTR No. 5765, April, 1949.



THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953


Visual Tracking: III. The Instrumental Dimension of Motion
in Relation to Tracking Accuracy *


Robert S. Lincoln 
The Johns Hopkins University 


In the typical tracking task a human operator
is required to make corrective responses
to visual cues provided by the relative positions
and rates of travel of a target and a
cursor or follower. With modern equipment
the operator seldom manipulates the cursor
itself. Rather he effects his control through
a mechanical or electrical device that links
him to the controlled cursor. This connecting
mechanism usually alters the relationship
between the operator's controlling motions
and the actual movements of the cursor. The
nature of the alteration is determined by the
characteristics of the tracking instrument.
Systematic variation of these characteristics
defines the instrumental dimension of tracking
motions.

Within this dimension three different types
of instrumental alteration of motion are of
special interest since all of them have been
incorporated in remote-control tracking equipment.
These alterations may be termed:
(a) translations, (b) transformations, and
(c) integrations of motion. Translated motions
directly reflect the characteristics of the
motions made by the operator, but the translated
motions are either amplifications or
reductions of the operator's motions. Transformed
motions reflect the operator's movements
only in a special way since the system
output is not a direct counterpart of the motion
input (2). In the tracking device considered
in this paper, the transforming instrument
changes an input of extent of movement
into an output of velocity of movement.


* This report is based on a dissertation submitted
in partial fulfillment of the requirements for the
degree of Doctor of Philosophy at the University of
Wisconsin in 1952. The research was conducted under
the direction of Dr. Karl U. Smith to whom the
writer is greatly indebted. Support for the research
was provided by the Research Committee of the
Graduate School from special funds voted by the
Legislature of the State of Wisconsin. Reported at
the [...] Meeting of the Eastern Psychological
Association.


Integrations of motion involve the combina- 
tion into one output of a simultaneous trans- 
lation and transformation of the same move- 
ment of the operator. 

The various instrumental alterations of mo- 
tion that have been described are produced 
by direct, velocity, and aided tracking sys- 
tems, respectively. Table 1 indicates the 
component motions required to achieve con- 
trol of the position and rate of travel of the 
cursor in each of the three types of tracking, 
and the alterations of these motions that are 
produced by the different tracking systems. 

As Table 1 shows, direct tracking involves 
two translations of motion while the velocity 
tracking mechanism produces two transforma- 
tions of motion. In aided tracking an in- 
tegration of the simultaneous translation and 
transformation of the same positioning motion 
is developed by the tracking device. 
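
The three alterations of motion can be sketched as a small discrete-time simulation. This is an illustration under assumptions, not a description of the apparatus: the function name, the unit gain, and the rectangular integration are hypothetical, and the aided-tracking time constant is taken here to be the ratio of the position component to the rate component of the output.

```python
def simulate_cursor(handwheel, dt, mode, gain=1.0, time_constant=0.5):
    """Cursor positions for a sequence of handwheel displacements.

    direct:   output is a scaled copy of the input (translation)
    velocity: input extent sets cursor velocity (transformation)
    aided:    sum of both (integration of translation and transformation)
    """
    positions = []
    integral = 0.0
    for x in handwheel:
        integral += x * dt                      # running integral of the input
        if mode == "direct":
            position = gain * x
        elif mode == "velocity":
            position = gain * integral
        elif mode == "aided":
            # assumed form: position term plus rate term scaled by 1/T
            position = gain * (x + integral / time_constant)
        else:
            raise ValueError("mode must be 'direct', 'velocity', or 'aided'")
        positions.append(position)
    return positions
```

Holding the handwheel at a fixed displacement therefore leaves the cursor stationary under direct control, moves it at a constant rate under velocity control, and produces an immediate step followed by constant-rate travel under aided control.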

This study was designed to provide in- 
formation related to three main questions: 


Table 1

Instrumental Alterations of the Operator Motions
Required to Achieve Control of the Position
and Rate of Travel of the Cursor

Type of
Tracking    Positioning Motion                        Rate Motion

Direct      Translated into cursor positioning        Translated into rate of cursor travel

Velocity    Transformed into direction of cursor      Transformed into rate of cursor travel
            travel

Aided       Translated into cursor positioning        -
            and transformed into rate of cursor
            travel; the translation and
            transformation are integrated.


1. Are the characteristics of the curve of 
skill acquisition in tracking changed by the 
different instrumental alterations of motion? 

2. Are the effects of practice sufficient to 
overcome differences that may exist in the 
difficulty of operation of control systems that 
produce translations, transformations, or in- 
tegrations of control movements? 

3. What transfer effects appear when train- 
ing on one type of tracking is followed by 
transfer to another type? 


Apparatus and Procedure 


The apparatus used in this study has previ- 
ously been described in some detail (3, 4, 5). 
The operator’s task is to align a cursor and a 
moving target by means of a handwheel control. 
The target moves over a circular course that in- 
cludes numerous reversals in the direction of 
target movement and continuous changes in target 
velocity. 

In order to compare the levels of accuracy that 
are obtained with direct, velocity, and aided con- 
trols, a special device has been constructed that 
permits a rapid change from one type of tracking 
to another without the alteration of other criti- 
cal features of the tracking task. 

Tracking-accuracy scores are obtained with a 
mechanism which integrates the tracking error 
record and provides a summated accuracy score 
on the dial of an electric clock. 

Prior to the experimental comparison of the 
different types of tracking, data were obtained on 
the optimum ratios of handwheel-to-cursor dis- 
placement (4). These optimal ratios were used 
in the comparison of the three types of tracking 
in order to insure that obtained differences in ac- 
curacy levels would not be a reflection of arbi- 
trarily chosen displacement ratios. For all ratios 
used in aided tracking, an aided tracking time 
constant of 0.5 second was maintained (6). 

Three groups of 18 subjects each were used in 
the study of training and transfer of training ef- 
fects. Subjects were randomly assigned to train- 
ing groups, and a different group of subjects was 
trained on each of the three types of tracking 
The training sessions extended over a period of 
six successive days. On each training day all 
subjects received ten one-minute tracking trials 
A 25-second rest pause was permitted between 
trials. 

Before the training trials were begun, all subjects
were given an explanation of the tracking
task and of the mechanical features of the control
that they would use. A brief demonstration
of the control was also provided. This same explanation
and demonstration was given to subjects
who transferred to a new type of tracking
during the transfer trials. Subjects received no
information concerning their accuracy in tracking
other than that provided by the visual display
of the apparatus.

The effects of transfer were observed on the
seventh day of the experiment. At this time six
subjects from each of the training groups received
ten trials on one of the two types of
tracking on which they had not been trained.
Six different subjects from each of the training
groups received ten trials on the second of the
two types on which the groups had received no
training. The remaining six subjects in each
training group continued with the type of tracking
on which they had been trained. These control
subjects made up the additional-practice
groups. A matching procedure was used to
equate the various transfer and control groups
on the basis of their accuracy scores achieved
during the training period.


Results 

Results of Practice. Performance curves 
for all training groups are shown in Figure i. 
In this figure mean accuracy scores are plotted 
for each trial. The training curves indicate 
that those subjects who trained on direct 
tracking made the highest mean accuracy 
score on every trial throughout the entire 
training period. Those subjects who trained 
on velocity tracking made the lowest accuracy 
scores. The accuracy of aided tracking
consistently fell between these two extremes,
although, after the first few trials, aided
accuracy closely approached direct accuracy.

From these results it is apparent that the
mechanical devices developed as an aid to the
tracking operator are of no general value
under the present experimental conditions.
It is quite possible, however, that different
results might be obtained with a more uniform
target course, or in situations that require
the operator to track continuously for
long periods of time.

In order to evaluate the significance of
some of the characteristics of the practice
curves, a test of trend (1) was applied to the
training data. In this test the mean accuracy
scores for days, rather than trials, were used.
Before the test of trend was carried out, it
was necessary to perform an arc sine
transformation (7) of the mean scores after
each score had been calculated as a percentage
of the maximum possible score. This procedure
was required to remove a negative correlation
between the means and variances of the
training groups.
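
The transformation step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's computation: the function name `arcsine_transform` is hypothetical, the maximum possible score is taken to be 60 seconds (the length of one trial), and the result is given in degrees as asin(sqrt(p)), the usual angular form. The paper does not say which variant of the transformation was used.

```python
import math


def arcsine_transform(score_seconds, max_seconds=60.0):
    """Return the angular (arc sine) transform of an accuracy score.

    The score is first expressed as a proportion of the maximum
    possible score and then mapped to asin(sqrt(p)) in degrees.
    The 60-second maximum is an assumption based on the one-minute
    trial length.
    """
    if not 0.0 <= score_seconds <= max_seconds:
        raise ValueError("score must lie between 0 and the maximum")
    proportion = score_seconds / max_seconds
    return math.degrees(math.asin(math.sqrt(proportion)))


if __name__ == "__main__":
    # Hypothetical daily mean scores in seconds, not data from the study.
    for mean_score in (15.0, 30.0, 45.0):
        print(mean_score, round(arcsine_transform(mean_score), 2))
```

The transform stretches values near 0 and 100 per cent, which tends to make the variance of percentage scores less dependent on their mean.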


Fig. 1. The levels of accuracy reached by
groups of subjects who trained on direct,
aided, or velocity tracking. The vertical
lines indicate the blocks of ten trials which
made up a day’s run. (Ordinate: accuracy in
seconds; abscissa: trials; separate curves for
the direct, aided, and velocity groups.)


According to the trend test, the training 
curves show significant (p < .001) deviation
from linearity. The curves for the separate 
groups also differ significantly in regard to 
the degree with which they depart from line- 
arity. In addition, the slopes of best-fitting 
straight lines differ between groups. 

Results of Transfer. The data obtained 
during the transfer trials were subjected to 
an analysis of variance. Before the analysis 
was begun, the accuracy scores were trans- 
formed in the same manner as the training 
scores. The analysis of variance showed all 
of the main sources of variation to be sig- 
nificant (p < .001), but a significant inter- 
action (p < .001) between training and trans- 
fer types of tracking greatly modified the 
importance of the main variables. This re- 
sult indicates that the effects of training are 
highly specific in nature since no type of 
training led to superior over-all performance 
when transfer was made to all three types of 
control. Rather, the effects of training
depended upon the type of tracking to which
transfer was made.


Fig. 2. Accuracy in tracking as a function of
the number of common instrumental relationships
between training and transfer tasks. (Abscissa:
number of common instrumental relationships,
0 to 2; ordinate: accuracy; points are labeled
by training type and transfer type: direct,
aided, velocity.)

Fig. 3. Positive and negative transfer effects.
The zero point on each graph represents the
initial accuracy level achieved by untrained
subjects on the transfer types. Plus deviations
from the zero line indicate positive transfer
effects, while minus deviations indicate
negative transfer effects. All graphs show the
type and amount of transfer effect produced
when subjects are trained on direct tracking
(D), velocity tracking (V), or aided tracking
(A), and later transfer to the same or a
different type of tracking. (One graph for each
transfer type: direct, velocity, aided;
ordinate: accuracy score difference; abscissa:
training type.)

As has been pointed out, direct tracking
involves two instrumental translations of the
operator’s control motions, while the velocity
mechanism produces two transformations of
the operator’s motions. Aided tracking
involves one translation and one transformation
of motion. Direct and velocity tracking,
therefore, have no common instrumental
relationships while aided tracking has one
relationship in common with both direct and
velocity tracking. Figure 2 shows that the
amount of transfer, as measured by accuracy
scores, is directly related to the number of
instrumental relationships that are common to
both the training and transfer tasks. The
figure indicates (for example) that training
on aided tracking led to greater accuracy
upon transfer to the velocity control than
did training on direct tracking. A similar
interpretation may be applied to the other
points on the curves.

In Figure 2 the additional-practice groups
are shown as having “transferred” to the same
type of control as the one on which they had
trained. The relative positions of these
groups indicate that direct tracking was still
slightly superior to aided tracking on the
seventh day of the experiment.
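
The counting of common instrumental relationships described above can be sketched as a small Python example. The composition of each control type is taken from the text; treating the count as a multiset intersection, and assigning a count of 2 to the additional-practice groups that stay on the same control, are assumptions made for illustration. The names `RELATIONSHIPS` and `common_relationships` are hypothetical.

```python
from collections import Counter

# Instrumental relationships of each control type, as stated in the text.
RELATIONSHIPS = {
    "direct": Counter(translation=2),
    "velocity": Counter(transformation=2),
    "aided": Counter(translation=1, transformation=1),
}


def common_relationships(training_type, transfer_type):
    """Count relationships shared by the training and transfer controls.

    Counter intersection keeps, for each kind of relationship, the
    smaller of the two counts, so the result is the number of
    relationships the two controls have in common.
    """
    shared = RELATIONSHIPS[training_type] & RELATIONSHIPS[transfer_type]
    return sum(shared.values())


if __name__ == "__main__":
    for training in RELATIONSHIPS:
        for transfer in RELATIONSHIPS:
            print(training, "->", transfer,
                  common_relationships(training, transfer))
```

Under these assumptions the count is 0 for direct and velocity tracking, 1 for aided tracking paired with either of the others, and 2 when training and transfer use the same control.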

Another suggestion of transfer effects may
be obtained from a comparison of the accuracy
scores achieved by untrained subjects
on a given type of tracking with the scores
made by subjects who transfer to that type
following training on another type. Figure 3
pictures this kind of transfer effect. The zero
point on the ordinate of each graph represents
the mean score made on the first ten
training trials by the 18 subjects who trained
on each of the transfer types. The remaining
ordinate values indicate the amount and
direction of the differences between the mean
scores for the untrained subjects and the
scores made by the six subjects who transferred
to the different transfer types after
training on either direct, velocity, or aided
tracking. In Figure 3 the additional-practice
groups are again shown as having “transferred”
to the type of control on which they
were trained.

The significance of these transfer effects 
was evaluated by means of t tests that were
performed on untransformed accuracy scores 
after F tests had indicated the homogeneity 
of the variances of the different training and 
transfer groups. All transfer effects are sig- 
nificant (p < .05) with the exception of that 
effect produced by transfer to velocity track- 
ing after training on direct tracking. 

These results indicate that the prediction 
of transfer effects must take into account the 
direction of transfer as well as the num- 
ber of instrumental relationships common to 
both the training and transfer tasks. Train- 
ing on direct tracking produced a positive 
effect upon transfer to the aided control and 
no significant effect upon transfer to the 
velocity control, while training on either 
velocity or aided tracking produced a nega- 
tive effect upon transfer to the direct control. 


Summary and Conclusions 


This study was designed to provide in- 
formation concerning the acquisition and 
transfer of skill in the operation of remote 
control devices which produce instrumental 
translations, transformations, and integra- 
tions of the operator’s controlling motions. 
These instrumental alterations of response are 
produced by direct, velocity, and aided
tracking systems.

Each of three groups of 18 subjects re- 
ceived training on either direct, velocity, or 
aided tracking for a period extending through 
six successive days. On the seventh day of 
the experiment 12 subjects from each train- 
ing group transferred to different types of 
tracking while the remaining six subjects in 
each group continued to track with the con- 
trol on which they had been trained. Ac- 
curacy scores achieved by the subjects were 
analyzed with regard to the effects of both 
practice and transfer. The results of the ex- 
periment suggest a number of conclusions. 

1. The instrumental characteristics of
control devices are prime determinants of the
accuracy with which those devices may be
operated.

2. Practice curves for the three types of 
tracking show the typical negative accelera- 
tion which has previously been demonstrated 
in studies of direct tracking behavior. 

3. For complicated target courses, the ac- 
curacy of direct tracking is consistently su- 
perior to both aided tracking and velocity 
tracking. Aided control is also far superior 
to velocity control. 

4. Within the limits of this experiment, the 
effects of practice are not sufficient to elimi- 
nate the differences in accuracy achieved with 
the three types of tracking. 

5. The effects of training are highly spe- 
cific in nature. The best performance in 
transfer to any type of tracking is achieved 
by subjects who are trained on that specific 
type. 

6. Negative transfer effects appear when 
subjects transfer from aided or velocity track- 
ing to the direct control, while positive trans- 
fer effects appear in the reverse situation. 
The amount of transfer is directly related to 
the number of instrumental relationships that 
are common to both the training and transfer 
tasks. 


Received January 12, 1953. 


References 


1. Alexander, H. W. A general test for trend. Psy- 
chol. Bull., 1946, 43, 533-557. 

2. Craig, D. R. and Ellson, D. G. The design of 
controls in Human Factors in Undersea War- 
fare. Washington, D. C.: National Research 
Council, 1949. 

3. Lincoln, R. S. and Smith, K. U. Transfer of 
training in tracking performance at different 
target speeds. J. appl. Psychol., 1951, 35, 
358-362. 

4. Lincoln, R. S. and Smith, K. U. Systematic
analysis of factors determining accuracy in 
visual tracking. Science, 1952, 116, 183-187. 

5. Lincoln, R. S. and Smith, K. U. Visual tracking: 
II. Effects of brightness and width of target.
J. appl. Psychol., 1952, 36, 417-421. 

6. Mechler, E. A., Russell, J. B., and Preston, M. G. 
The basis for the optimum aided-tracking 
time constant. J. Franklin Inst., 1949, 248, 
327-334. 

7. Snedecor, G. W. Statistical methods. Ames, 
Iowa: Iowa State College Press, 1946.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


Identification of Cola Beverages Overseas * 


E. Terry Prothro 
Brooklyn College 


Bottling and sale of cola beverages is now 
taking place in many countries around the 
world, and consumption of these beverages 
has become a part of the life of inhabitants 
of every continent. Both Coca-Cola and 
Pepsi-Cola are widely sold in Lebanon, for 
example, and their popularity has stimulated 
the production of several imitations. It there- 
fore seems worthwhile to determine whether 
or not consumers can differentiate the Ameri- 
can beverages from each other and from the 
local colas on a basis of taste. 

A series of investigations by Pronko and 
others indicated that subjects could not 
identify American colas better than chance 
when many different brands were presented 
(1), but that they could identify Coca-Cola 
significantly more often than chance when 
only the three leading brands were used (2). 
It was therefore decided to use only three 
beverages, including Coca-Cola, in this in- 
vestigation, so that there would be maximum 
opportunity to reveal taste differences be- 
tween the beverages. 


Procedure 


The three leading cola drinks of Lebanon 
were used in this study. These three, in
order of popularity, are Coca-Cola, Pepsi-Cola,
and Williams Champagne Cola. The
American colas are bottled in local plants but 
according to a special and presumably secret 
process dictated by the parent corporations. 
Champagne Cola is produced by the Williams 
plant, a Lebanese organization which also 
produces many other soft drinks. This cola 
resembles the American drinks in appearance.
It was introduced after the early success of 
Coca-Cola. Coca-Cola has been distributed 
there for nearly three years, Pepsi-Cola for 
six months, and Williams Cola for more than 
one year. 


* This study was conducted overseas while the
author was teaching at the American University
of Beirut.


A total of 60 students of the American
University of Beirut volunteered to serve as
subjects. Each subject stated that he was
familiar with the taste of the beverages used,
that he was not suffering from a cold, and that
he had no political or religious objection to
any of the beverages. He was then told that
he would be given each of the three colas in 
series, and that he was to identify each after 
its presentation. Approximately 2 oz. of 
refrigerated cola were used at each presenta- 
tion. The beverages were presented in iden- 
tical 6 oz. glasses. Subjects were blindfolded 
during the trials. Approximately one minute 
elapsed between each trial, during which time 
the subjects were asked to rinse their mouths 
with water. There are six possible arrangements
when three colas are presented in series.
Each of the six arrangements was used for 
ten subjects. 
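
The counterbalancing described above can be sketched in Python. The six arrangements are the permutations of the three colas. How subjects were allotted to arrangements is not stated in the paper, so assigning consecutive blocks of ten subject numbers to each order is an assumption, and the name `presentation_schedule` is hypothetical.

```python
from itertools import permutations

COLAS = ("Coca-Cola", "Pepsi-Cola", "Champagne Cola")
SUBJECTS_PER_ORDER = 10


def presentation_schedule(n_subjects=60):
    """Map each subject number to one of the six presentation orders.

    Subjects 1-10 receive the first order, 11-20 the second, and so
    on. This blockwise allotment is an assumption for illustration.
    """
    orders = list(permutations(COLAS))  # 3! = 6 arrangements
    if n_subjects != len(orders) * SUBJECTS_PER_ORDER:
        raise ValueError("expected ten subjects for each of six orders")
    return {
        subject: orders[(subject - 1) // SUBJECTS_PER_ORDER]
        for subject in range(1, n_subjects + 1)
    }


if __name__ == "__main__":
    schedule = presentation_schedule()
    for subject in (1, 11, 21, 31, 41, 51):
        print(subject, schedule[subject])
```

With this design each cola is presented first, second, and third to 20 subjects each, so position in the series is balanced across the beverages.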


Results 

Although the subjects were informed that 
each of the three colas would be presented, 
some of them felt that instructions in a
psychological experiment cannot be relied upon.
One subject believed that a single cola was
presented three times, and some believed that
other colas were being presented. From Table
1 it can be seen that the most recently
introduced cola (Pepsi-Cola) was named most
often. This fact may be a result of the
extensive advertising campaign that
accompanied its introduction.

Table 1

Cola         Coca    Pepsi   Champagne
Presented    Cola    Cola    Cola        Other   Total
Coca Cola     24      30        5          1       60
Pepsi         26      28        5          1       60
Champagne      1       6       51          2       60
Total         51      64       61          4      180

Our subjects were not able to differentiate 
between the two American colas. Indeed 
Coca-Cola was called Pepsi-Cola more often 
than it was named correctly. On the other 
hand, the subjects did identify the local cola 
quite well, and showed little tendency to con- 
fuse it with the American colas. If we employ 
the chi-squared test of significance, we find 
that the American colas are identified cor- 
rectly only slightly, and insignificantly, more 
often than chance. The correct identification 
of Williams Champagne Cola was not at- 
tributable to chance. The superiority to 
chance was significant at the .001 level. 
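
The significance statement for the local cola can be illustrated with a short chi-squared calculation. This is a sketch under stated assumptions, not the author's computation: the count of 51 correct identifications in 60 presentations is an assumed reading of Table 1, and chance is taken as one in three because three colas were presented. The paper does not say how the chance frequencies were derived. The function name `chi_square_vs_chance` is hypothetical.

```python
def chi_square_vs_chance(correct, trials, p_chance):
    """Chi-squared (1 d.f.) comparing correct and incorrect
    identifications with the frequencies expected by chance."""
    expected_correct = trials * p_chance
    expected_wrong = trials - expected_correct
    wrong = trials - correct
    return ((correct - expected_correct) ** 2 / expected_correct
            + (wrong - expected_wrong) ** 2 / expected_wrong)


# Tabled chi-squared value for p = .001 with one degree of freedom.
CRITICAL_001_DF1 = 10.83

if __name__ == "__main__":
    # Assumed reading of Table 1: Champagne Cola named correctly 51 times.
    value = chi_square_vs_chance(correct=51, trials=60, p_chance=1 / 3)
    print(round(value, 3), value > CRITICAL_001_DF1)
```

Under these assumptions the statistic is far above the tabled value for the .001 level, in line with the conclusion stated in the text for the local cola.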


Summary 


A total of 60 students at the American
University of Beirut were asked to identify
Coca-Cola, Pepsi-Cola, and a popular local cola
without reliance on visual stimuli. Although 
these colas are widely distributed locally, and 
the subjects stated that they were familiar 
with them, it was found that the American 
colas could not be differentiated from each 
other. The local product could however be 
distinguished from the American brands in 
spite of the fact that it is an imitation of 
them. 


Received March 3, 1953. 


References 


1. Pronko, N. H. and Bowles, J. W. Identification 
of cola beverages. III. A final study. J. appl. 
Psychol., 1949, 33, 605-608. 

2. Pronko, N. H. and Herman, D. T. Identification
of cola beverages. IV. Postscript. J. appl.
Psychol., 1950, 34, 68-69.


THE JOURNAL OF APPLIED PSYCHOLOGY
Vol. 37, No. 6, 1953 


Applied Psychology in Action 


The Non-Directive Approach in Advertising Appeals 


Howard D. Hadley 


Daniel Starch and Staff, Mamaroneck, N. Y.


In recent years a new basic approach to 
psychotherapy has been developed. It is 
called the non-directive method. Because it 
has some implications for advertising, there 
may be value in describing this technique 
along with its counterpart—the directive 
method. 

Non-directive psychotherapy is built around 
these central concepts: 

1. That all individuals have the basic ca- 
pacity to understand the forces in their lives 
which cause them unhappiness and pain. 
Moreover, they also can understand those 
forces which lead to pleasure and well-being. 

2. That all individuals, by personal effort, 
ultimately can overcome the bad or enhance 
the good forces in their lives. 

3. That this process is made easier, quicker, 
or more effective in an atmosphere that is 
friendly, sincere, and understanding. 

In non-directive therapy, the therapist does 
not assume any of the usual responsibilities 
such as prescribing treatment or even directly 
defining the cure. Instead, the therapist at- 
tempts to set up with the individual a rela- 
tionship, an atmosphere, in which the person 
may talk or act without danger of being
criticized. It is one of complete acceptance.
Within this tender environment, the person 
himself comes to understand and to re-evalu- 
ate the forces operating to make him happy or 
unhappy. 

Directive therapy differs in that it usually
requires a direct assault upon the individual’s
maladjustments: the person is tested,
analyzed, and then told what is wrong and given
a prescription for a cure. It is similar to
going to a doctor only to find out you have
appendicitis. He puts you in a hospital,
removes the diseased organ, and your body
then is able to complete the recovery. When
dealing with the physical body, the doctor’s
obligation is often greater than the patient’s.

When dealing with a person’s mind, this
method sometimes falls down because the
person is reluctant to believe or is unable to
accept the diagnosis of the therapist. The
therapist can only discover and point out the
path; he cannot walk it. Many persons
benefit from this advice but many others do not.

In advertising it is very similar: there are
two ways in which to advertise. The
advertiser can tell the person to buy the product
because it will do this or that for him. This
is comparable to the directive method where
the individual is told what his troubles are and
how to cure them. Or the advertiser can
create a friendly, sincere, and understanding
atmosphere which shows the benefits of the
product without direct intention to sell. This
places the person in a situation where he is
able to accept new ideas without threat to his
old ideas.

To complete the comparison, directive
therapy is similar to the direct appeal in
advertising. In each case, the person to be
influenced is directly told what is wrong and
how to correct it. The non-directive therapy
can be compared with the inferred appeal in
[...] things, persons, or events. The acceptance
of the associated “thing” creates an [...]
the individual feels free [...]. Also, the
acceptability [...] transferred to the product.
Advertisers using the inferred appeal include
Modess, John Hancock, Breck, Old Golds, and
Seagram’s 7 Crown. Each of these advertisers
avoids a direct assault upon the
consumer’s credibility but disarms him and then
introduces a strong simply-worded message.
Here are some examples from current
advertising which help to illustrate the [...].

[...] take the Philip Morris theme:
“Something Wonderful Happens—.” Here is an
attitude that some persons actually attain after
changing to Philip Morris. At the present 
time, this very personal theme is put across in 
a very direct manner. Readers are now being 
told that “Something Wonderful Happens 
when they change to Philip Morris.” It is 
entirely possible that the theme—which is 
excellent—might be more effective if readers 
were to arrive at this realization without such 
obvious aid. To be told about this antici- 
pated change in the personality tends to act 
as a threat, which leads to resistance and 
withdrawal. The point could be put across 
showing this “Something Wonderful” hap- 
pening to others and then associating it in- 
directly with Philip Morris. 

Basically, persons earnestly want to be- 
lieve advertising, but they are afraid to. This 
fear is the result of the possible “frustration” 
they may suffer because of false and mislead- 
ing advertising when the product fails to live 
up to a person’s expectations. If this happens 
with advertising which uses the direct appeal, 
the consumer blames the advertiser. If this 
failure occurs with a product which has used 
inferred appeals, the consumer is less apt to 
complain 1) because of lack of specific prom- 
ises in the advertising and 2) because he was 
the one who decided what the product would 
do—and not the advertiser. 

To remain in the cigarette field, let’s take 
the king size cigarettes. A person who uses 
the regular length cigarettes is told that he 
is not being economical and is also remiss in 
attention to his health. While both of these 
points have the popular vote behind them, a 
person will tend to treat the advertising as a 
threat to his judgment. Such advertising is 
a negative approach to a negative appeal. 
The inferred (non-directive) appeal would 
show a person enjoying increased smoking 
pleasure from a longer smoke.

Throughout this whole comparison, there 
are two distinct philosophies of advertising. 
One of them is to “tell them” and the other 
is to “have them tell themselves.” From evi- 
dence in psychotherapy, and from evidence 
presented in the October 1952 issue of this 
magazine [Advertising Agency], it appears 
that when credibility is lacking on the part of
the person, the latter (inferred) method is
more effective. 

Let’s look at the record of some beer ad- 
vertisers who use, in greater or lesser degrees, 
the inferred technique. Beer is taken as an 
example because beer is almost universally
used; more money is spent for it each year
than for milk or cigarettes. Also, the
response of consumers to beer advertising is
quicker and greater than for many other mass 
distributed and used products. From 1949 
through 1951 there was a four per cent
decrease in the amount of beer consumed.
However, the leader, Schlitz, gained about 20 
per cent. Rheingold had close to a 50 per 
cent gain, and Miller’s almost doubled. While 
there may have been other factors operating, 
it’s hard to deny credit to the type of adver- 
tising these beer companies have been using. 
As a matter of fact, if you look at the top 
brand in any product group, you will usually 
find that they have used the non-directive or 
inferred approach. 

If such an approach is so good, why don’t 
more persons use it? Here are some possible 
reasons: 

1. Most advertisers (not agency personnel) 
cannot resist the temptation to “tell them”— 
to sell the product as an extroverted salesman 
would. This is a very easy pitfall into which 
to fall, and a hard one to leave. 

2. Not everyone is able to use the inferred 
(non-directive) technique. To a large extent 
it is dependent upon the personality of the 
creative persons. Some persons think and act 
in a non-directive manner. They are friendly, 
sincere, and understanding. Because of these 
personality characteristics, they are able to 
create advertising which is in keeping with 
their temperament. Other persons are of a 
different turn. They are better at employing 
stronger, more obvious and promotional meth- 
ods to get across a point. This is not in- 
tended as a criticism since there is certainly 
room in the advertising field for both. How- 
ever, don’t expect one type of person to turn 
out a different type of advertising. It is hard 
(and uneconomical) to “live a lie” and the 
consuming public is quick to catch insincerity 
in advertising. 




3. The direct method still sells goods. This 
is a potent argument. While there are many 
persons who are influenced by it, it seems as 
though an advertiser should use the inferred 
approach if he wishes to be really big. For 
this approach appears to be effective with the 
greatest number of people. 

To summarize the high points: 

1. There is a close parallel between con- 
cepts used in psychotherapy and advertising. 

2. The non-directive technique is quite
comparable with the inferred technique in
advertising as exemplified by Modess, Breck, John
Hancock, Old Golds, and Seagram’s 7 Crown. 

3. Both techniques appear to be successful 
because the “patient” or “consumer” arrives 
at judgments by himself without being di- 
rectly told. 

4. Successful use of both methods is very 
dependent upon the personality of the per- 
sons involved—therapists or creative persons. 


Reprinted from Advertising Agency, April, 1953. 


Reading: Stop Wasting Your Time 


Take a look at your mail. In addition to 
the usual flood of letters, ads, memos, it 
probably contains a couple of newspapers, a 
rash of magazines, an occasional book or two, 
and a shower of releases, pamphlets, broad- 
sides, etc. Management personnel spend 
about 15 hours a week just reading. And in 
many jobs you may well spend more. But 
how much of it is wasted? Tests show the 
average businessman reads only slightly better 
than an eighth-grade schoolboy—and that is 
still above the national level. Trouble is, few 
people have ever received any reading training 
after elementary school. More and more
executives are aware of this handicap, are
turning to reading development firms like The
Reading Laboratory in New York and
Chicago’s Foundation for Better Reading. These
groups specialize in training executives to read
faster, better. Goal of the 20-hour course is
a reading speed of 650-700 words per minute.
National average is about 250 words. Many
taking the course do far better.

Reading Laboratory Director K. P. Bald- 
ridge points out that reading speed will, of 
course, vary with the difficulty of the material. 
But you can read even legal and scientific 
matter faster with proper training. 

Procedure starts off with an eyesight check,
follows with photos of eye motion. The Lab
also uses a battery of diagnostic tests to
determine vocabulary level, reading speed and
comprehension, and reading mechanics.
Experts stress that anyone’s reading can be
improved. While professional guidance and
special equipment is needed for difficult cases
and major advancement, Reading Lab
personnel point out you can progress on your
own. A good vocabulary is essential to
reading skill. As a business executive yours
is well above average now, but there is always
room for improvement. There are several
good systems now on the market if you feel
you need them. Tests show, incidentally, a
high correlation between vocabulary level and
general executive ability. (Iron Age, March
5, 1953.)


Book Reviews 


Arsenian, S., Ed. In Memoriam—Rudolf
Pintner. Washington, D. C.: Gallaudet
College Press, 1953. Pp. 1-63. Gratis.

The idea of a memorial volume for Dr.
Pintner originated with his many former
students, colleagues, and friends and was carried
through to completion by Seth Arsenian,
Editor and Chairman of the Pintner Memorial
Committee. The volume contains a portrait
of Pintner (1884-1942), a foreword by the
Editor, a tribute prepared for the Faculty of
Philosophy of Columbia University by H. L.
Hollingworth shortly after Dr. Pintner’s
untimely death, and an annotated bibliography
of Pintner’s contributions beginning in 1912.
There are 182 annotations in all for the 30
years or an average of six per year.

Copies are being distributed to college and 
university libraries in the United States and 
psychologists who are interested in securing 
a copy may write to Gallaudet College. 

This is an appropriate type of memorial be- 
cause Dr. Pintner was an indefatigable worker 
and the quality and quantity of his research 
articles, books, and tests helped put Ameri- 
can psychology in a position of world leader- 
ship in the field of mental measurements. It 
is especially worthwhile to have this extensive 
annotated bibliography available in view of 
the fact that the ambitious plans of Murchi- 
son to publish bibliographies in successive 
editions of The Psychological Register were 
abandoned for financial reasons some twenty
years ago.

Donald G. Paterson
The University of Minnesota


Karn, H. W. and Gilmer, B. von H. Read- 
ings in industrial and business psychology. 
New York: McGraw-Hill, 1952. Pp. 476. 
$4.50. 

Blum, M. L. Readings in experimental in- 
dustrial psychology. New York: Prentice- 
Hall, 1952. Pp. 455. $4.75. 

Both these books were prepared primarily 
as supplementary texts for courses in indus- 
trial psychology. 

Readings in Industrial and Business Psy- 
chology consists of 53 selections, mostly re- 
cent journal articles, covering topics com- 


monly found in industrial psychology texts. 
Most of the articles are neither popular nor 
highly technical. The book would be ap- 
propriate for either graduate or undergrad- 
uate students. About half the articles report 
research studies and the others are discursive 
or theoretical. Titles for the eleven parts are: 
Motivation and Morale, Training in Indus- 
try, Analysis and Evaluation of Job Perform- 
ance, Psychological Tests, Interviewing and 
Counseling, Accidents and Safety, Fatigue 
and Worker Efficiency, Market Research, In- 
dustrial Leadership, Industrial Relations, and 
Psychologists in Industry. The editors have 
provided a three or four sentence summary 
preceding each article. 

Readings in Experimental Industrial Psy- 
chology has 63 recent journal articles. Nearly 
all the selections present results of research 
studies. About a third of the book is de- 
voted to common textbook topics: Employee 
Selection, Application Blanks, Training, Mo- 
tivation and Production, Labor Relations, and 
Music in Industry. Another third of the book 
is given to Engineering Problems, Display and 
Control Design Studies done for the Air Force, 
and Research in Visibility and Legibility.
The remaining third of the book includes 
Marketing Research, and chapters on three 
new measurement techniques: the Flesch 
Formula, Forced Choice, and Critical Inci- 
dents. The editor has prepared a one or two 
page introduction for each of the 14 chapters 
in which he discusses, in a very readable man- 
ner, the importance of and main problems of 
each research area, and also summarizes the 
articles that have been selected for that sec- 
tion. 

How should one go about making a critical 
appraisal of a book of readings in industrial 
psychology? If we ask that the book include 
only articles that were important new contri- 
butions to the field when they first appeared 
then both books are weak. Articles are in- 
cluded in each book which present neither 
new ideas nor important research findings. 
If we expect to find only articles which are 
models of careful and thorough research again 
we will be disappointed in these books.
Research studies are presented which are faulty
in design and which arrive at unjustified
conclusions. For instance, both books report
validity studies in which item analysis is used 
without any cross validation of results. In 
none of these instances do the writers point 
out that their findings are what Cureton has 
rightly called “baloney.” We might ask 
whether the books give a realistic picture of 
present day industrial psychology. Both 
books fail to meet this requirement also. 
This is not the fault of the editors, how- 
ever, as they were limited by the available 
supply of articles. Many industrial psy- 
chologists do not publish their research at 
all. Research that results in so-called “nega- 
tive” results is frequently not reported and 
the cumulative effect of publishing only “posi- 
tive” findings is quite misleading. Those 
psychologists in the industrial field who do 
publish their work usually do not give an 
adequate account of the many practical dif- 
ficulties they have faced in carrying out 
worthwhile research and in clearly demon- 
strating the significance and value of their 
findings. 

The following criteria, however, seem more 
important to me as a basis for selecting 
articles for a book of readings in industrial 
psychology. 

1. The writer should have a worthwhile 
point to make and should do so clearly and 
briefly but at the same time adequately.

2. The articles should cover as wide a 
variety of problems and approaches as possible.
There should be a minimum of overlapping.

3. The articles as a group should emphasize 
the use of scientific methodology in industrial
psychology.

4. The articles should be stimulating material
for group discussion or individual criticism.

In general, both books meet these four requirements
very well. Nearly all the articles
are very clearly written and need little or no
explanation by the editors. Blum has been
especially successful in emphasizing the use
of the scientific method. Karn and Gilmer
provide an excellent sampling of the kinds of
problems psychologists in industry have most
frequently dealt with in recent years. Blum,
on the other hand, gives considerable space
to topics that psychologists in industry have 
not generally devoted their attention to up to 
now. Not many psychologists have been con- 
cerned with equipment design and “biome- 
chanics” for instance. But these are new and 
stimulating areas for research and represent 
fields in which psychologists may be able to 
make many important contributions in the 
future. Both books can easily be used to 
stimulate discussion and criticism. Experimental
research which is reported clearly is
always good for this purpose. Both of these 
books, in my opinion, will be found useful by 
many instructors for training students in the 
field of business and industrial psychology.
Philip H. Kriedt 
Prudential Insurance Company, 
Newark, N. J. 


Steiner, Lee R. A practical guide for troubled
people. New York: Greenberg, 1952. Pp. 
299. $3.50. 

This book “is intended for the individual,
still in possession of his reasoning powers,
but who, nevertheless, feels the need for some
guidance with his life problems . . . to enable
him to select the most adequate advisor
for his particular woe.” The author’s earlier
volume Where Do People Take Their Troubles?
exposed the quack. Now Steiner exposes
the professional consultant. The cases
presented are not single individuals since the
author says, “Both the seekers and the practitioners,
as presented here, are composite
characters.” The “composite character” is,
of course, the standby of the fiction writer,
not of the objective reporter writing for people
who need correct information.

Several professions are explored: psychiatry,
psychosomatic medicine, psychoanalysis
(medical and non-medical), psychology, social
work, and ministry. There are chapters on
books as aids to the cure of personal problems,
on good old-fashioned advice, and on
solving one’s own problems.

The author uses a standard pattern for exploring
each profession. There is some description
of the field and the training required
for practitioners. A case history or
two shows that even in the professions
“quacks” exist and that mediocrity is sometimes
encountered. More cases are given to
present the profession in a better light. Then,
there are a few words suggesting how one can
select a psychiatrist, a psychologist, or other
professional consultant.

This is a disappointing book for several
reasons, two of which have been selected for 
comment. First, the audience is not kept 
clearly in mind. Why should intelligent peo- 
ple with troubles read page after page about 
the interprofessional tensions and the con- 
fusions of the professions which deal with 
problems of personality? Does knowledge 
that social workers, psychologists, and psy- 
chiatrists differ on who should do therapy 
really help the beautiful Mrs. Kimball who 
is bored at being socially successful and rich? 
There is too much misdirected, non-construc- 
tive, verbal finger-pointing, some of it further 
confused by professional jargon. 

The author assumes the reader will under- 
stand terms like libidinal, censor, id, Freudian, 
Rankian, Jungian, logical construct, and non-
direction. The intelligent reader who has not
specialized in the behavior sciences is for-
gotten.

Second, the chapter on psychology has both 
errors of fact and dubious evaluations. For 
instance, is the following sentence generally 
true concerning psychologists a couple of dec- 
ades ago, or now? “Having thus decided to 
study pure science, they promptly concen- 
trated on rats and hamsters, never permitting 
the study of human animals to pollute their

” Did Thorndike, Terman, Allport, 


concentrate on rats? And, many people consider
rat-men Hull and Tolman to have contributed
mightily to psychotherapy and social
psychology. Even though most psychologists
would acknowledge some basis for Steiner’s 
comments, what contribution can a dozen 
pages of ridicule of academic psychology make 
to the troubled persons for whom the Guide 
is written? 

How many readers of this review would
accept her statement that interpreting aptitude
tests is “often called occupationalogy”
(sic)? In a small sample, this reviewer
found no specialist in the field who ever had
heard this term. Consider this statement: 
“If the counsellor is specializing in vocational 
guidance, most of the good employment agen- 
cies and school bureaus would have an ac- 
curate idea of his worth.” Does Steiner really 
believe that managers of employment agencies 
are especially competent to evaluate a psychologist’s
work? She suggests writing to the
NVGA “to check the caliber of any vocational 
guidance service.” But in this 1952 book she 
gives the New York address from which 
NVGA moved to Washington in 1949. It is 
hard to reconcile her view that the NVGA 
listing can guide one to a sound vocational 
guidance service with her derision of the de- 
scriptions of members in the APA directory 
and her failure to mention the ABEPP di- 
ploma. She ends the chapter thus: “Choose 
your counsellor with caution.” But “How?” 
seems to be a minimized feature in this “how 
to do it” book. 

Steiner’s purpose is laudable. It is un- 
fortunate that she has written for too many 
audiences and produced such a muddled book. 
Her writing at times seems to show a fine 
understanding of the problems of populariza- 
tion. But the lapses into mixed and unclear 
metaphors (“the sterile vision in which she 
now stews”), the use of professional jargon 
with its semblance of cleverness, the hanging 
of our multi-pieced, and considerably un- 
washed, interprofessional laundry on the pub- 
lic street, all lead the reviewer to conclude 
that this book will not serve its purpose. 

Many common-sense and correct sugges- 
tions for securing good professional help are 
in this book. E.g., work through your family 
doctor; go through a reputable social agency. 
However, they are enmeshed among too many 
peculiarly emphasized points which are ir- 
relevant to people who buy this book as a 
“guide.” The need still persists for a good 
pamphlet giving authentic suggestions to peo- 
ple who want help on problems of personal 
adjustment. In a few dozen pages one should
be able to steer people away from quacks and 
toward reputable professional consultants. 

Harold Seashore 

The Psychological Corporation, 

New York, N. Y. 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson,
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.


Construction of educational and personnel tests.
Kenneth L. Bean. New York: McGraw-Hill Book
Company, 1953. Pp. 231. $4.50.

Short employment tests. George K. Bennett and
Marjorie Gelink. New York: The Psychological
Corporation, 1953. Pp. 10.

A manual for the state-wide testing programs of
Minnesota. Ralph F. Berdie, Wilbur L. Layton,
and Theda Hagenah. Minneapolis: University of
Minnesota Press, 1953. Pp. 86. $1.00.

Effective use of older workers. Elizabeth Breckinridge.
Chicago: Wilcox & Follett Co., 1953. Pp.
224. $4.00.

Company practices in marketing research. Richard
D. Crisp. New York: American Management Association,
1953. Pp. 63. $2.50.

The psychology of learning. James E. Deese. New

Personality and psychotherapy. John Dollard and
Neal E. Miller. New York: McGraw-Hill Book
Company, 1953. Pp. 483. $5.50.

Business planning in a changing world. M. J.
Dooher, Editor. New York: American Management
Association, 1953. Pp. 51. $1.25.

Making the most of your human resources. M. J.
Dooher, Editor. New York: American Management
Association, 1953. Pp. 76. $1.25.

Making personnel practices and programs pay off.
M. J. Dooher, Editor. New York: American
Management Association, 1953. Pp. 64. $1.25.

Evaluating sales training needs and methods. M. J.
Dooher, Editor. New York: American Management
Association, 1953. Pp. 32. $1.25.

Markets and marketing techniques. M. J. Dooher,
Editor. New York: American Management Association,
1953. Pp. 47. $1.25.

Research methods in the behavioral sciences. Leon
Festinger and Daniel Katz. New York: The
Dryden Press, Publishers, 1953. Pp. 660. $5.90.

How to evaluate students. Henrietta Fleck. Bloomington,
Illinois: McKnight & McKnight, 1953. Pp.
85. $1.

Measurement and evaluation in the elementary
school. H. A. Greene, A. N. Jorgenson, and J. R.
Gerberich. New York: Longmans, Green &
Co., Inc., 1953. Pp. 617. $5.00.

Juvenile delinquency with the MMPI. Starke R.
Hathaway and Elio D. Monachesi. Minneapolis:
University of Minnesota Press, 1953.
$3.50.

The education of exceptional children. Arch O. Heck.
New York: McGraw-Hill Book Company, 1953.
Pp. 513. $6.00.

Measurement in education. A. N. Jordan. New
York: McGraw-Hill Book Company, 1953. Pp.
533. $5.25.

Practical guidance methods. Robert H. Knapp.
New York: McGraw-Hill Book Company, 1953.
Pp. 320. $4.25.

Age and achievement. Harvey C. Lehman. Princeton:
Princeton University Press, 1953.
$7.50.

Measuring educational achievement. W. J. Michaels
and M. Ray Karnes. New York: McGraw-Hill
Book Company, 1953. Pp. 496.

Satisfactions in the white-collar job. Nancy C.
Morse. Ann Arbor: University of Michigan, Survey
Research Center, Institute for Social Research,
1953. Pp. 235. $3.50.

The influence of instructional sets on Minnesota
teacher attitude inventory scores. William Rabinowitz.
New York: College of the City of New
York, 1953. Pp. 19.

Communication in management. Charles E. Redfield.
Chicago: University of Chicago Press, 1953.
Pp. 290. $3.75.

The insight test. Helen D. Sargent. New York:
Grune & Stratton, Inc., 1953. Pp. 276.

Industrial psychology. (3rd Ed.) Joseph Tiffin.
New York: Prentice-Hall, Inc., 1953. Pp. 559.
$5.00.

Profitably using the general staff position in business.
Lyndall F. Urwick and Ernest Dale. New
York: American Management Association, 1953.
Pp. 35. $1.25.

Motivation and morale in industry. Morris S.
Viteles. New York: W. W. Norton & Company,
Inc., 1953. Pp. 510. $9.50.

Statistical inference. Helen M. Walker and Joseph
Lev. New York: Henry Holt and Company, 1953.
Pp. 510. $6.25.

Indirect methods of attitude measurement. Irving R.
Weschler and Raymond E. Bernberg. Los Angeles:
University of California, 1953. Pp. 138.

Management techniques for foremen. Richard W.
Wetherwill. New London, Connecticut: National
Foremen’s Institute, Inc., 1953. Pp. 177. $7.50.

Community services for older people. Community
Project for the Aged, Welfare Council of Metropolitan
Chicago. Chicago: Wilcox & Follett Co., 1953.
$4.00.

Army personnel tests and measurement. Department
of the Army. Washington, D. C.: United
States Government Printing Office, 1953. Pp. 125.
55 cents.

Health and human relations. The Josiah Macy, Jr.
Foundation. New York: The Blakiston Company,
Inc., 1953. Pp. 270. $6.00.

Group guidance of parents of mentally retarded children,
20 cents; Parents’ groups and the [...]
of mental retardation, 20 cents; Speaker’s [...],
$1.50. New York: Association for the [...]
Retarded Children.

The Three R’s for the retarded. National Association
for Retarded Children. 50 cents. Order from
Emily Kucirek, 2904 Oberlin Avenue, [...],
Ohio.


